  • Publication
    A Data Ingestion Procedure towards a Medical Images Repository
    This article presents an ingestion procedure towards an interoperable repository called ALPACS (Anonymized Local Picture Archiving and Communication System). ALPACS provides services to clinical and hospital users, who can access the repository data through an Artificial Intelligence (AI) application called PROXIMITY. This article shows the automated procedure for data ingestion from the medical imaging provider to the ALPACS repository. The data ingestion procedure was successfully applied by the data provider (Hospital Clínico de la Universidad de Chile, HCUCH) using a pseudo-anonymization algorithm at the source, thereby ensuring that the privacy of patients’ sensitive data is respected. Data transfer was carried out using international communication standards for health systems, which allows for replication of the procedure by other institutions that provide medical images. Objectives: This article aims to create a repository of 33,000 medical CT images and 33,000 diagnostic reports that conforms to international standards (HL7 HAPI FHIR, DICOM, SNOMED). This goal requires devising a data ingestion procedure that can be replicated by other provider institutions, guaranteeing data privacy by implementing a pseudo-anonymization algorithm at the source, and generating labels from annotations via NLP. Methodology: Our approach involves hybrid on-premise/cloud deployment of PACS and FHIR services, including transfer services for anonymized data to populate the repository through a structured ingestion procedure. We used NLP over the diagnostic reports to generate annotations, which were then used to train ML algorithms for content-based retrieval of similar exams. Outcomes: We successfully implemented ALPACS and PROXIMITY 2.0, ingesting almost 19,000 thorax CT exams to date along with their corresponding reports.
  • Publication
    Multicategory SVMs by minimizing the distances among convex-hull prototypes
    (2008-11-10)
    Concha, Carlos; Candel, Diego; Moraga, Claudio
    In this paper, we study a single-objective extension of support vector machines for multicategory classification. Extending the dual formulation of binary SVMs, the algorithm seeks to minimize the sum of all the pairwise distances among a set of prototypes, each one constrained to one of the convex hulls enclosing a class of examples. The final discriminant system is built by looking for an appropriate reference point in the feature space. The resulting method preserves the form and complexity of the binary case, optimizing just one convex objective function with m variables and 2m+K constraints, where m is the number of examples and K the number of classes. Non-linear extensions are straightforward using kernels, while soft-margin versions can be obtained by using reduced convex hulls. Experimental results on well-known UCI benchmarks are presented, comparing the accuracy and efficiency of the proposed approach with other state-of-the-art methods.
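    One possible reading of the optimization problem described in this abstract can be written out as follows. This is a sketch under stated assumptions, not the authors' exact formulation: the symbols $x_i$, $y_i$, $\alpha_i$, $z_k$ and $\mu$ are notation introduced here, and the squared Euclidean distance is an assumption chosen by analogy with the nearest-point form of the binary SVM dual.

    ```latex
    \begin{aligned}
    \min_{\alpha \in \mathbb{R}^m} \quad
      & \sum_{1 \le k < l \le K} \lVert z_k - z_l \rVert^2,
      \qquad z_k = \sum_{i \,:\, y_i = k} \alpha_i x_i \\
    \text{s.t.} \quad
      & \sum_{i \,:\, y_i = k} \alpha_i = 1, \quad k = 1, \dots, K, \\
      & 0 \le \alpha_i \le \mu, \quad i = 1, \dots, m.
    \end{aligned}
    ```

    Under this reading there are m variables, and the 2m bound constraints together with the K normalization constraints give the 2m+K count stated in the abstract. Setting $\mu = 1$ corresponds to ordinary convex hulls and $\mu < 1$ to reduced convex hulls. The objective depends on the examples only through inner products $x_i^\top x_j$, which is what makes a kernel substitution possible.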
  • Publication
    InverseTime: A self-supervised technique for semi-supervised classification of time series
    (2024-01-01)
    Goyo, Manuel Alejandro; Valle, Carlos
    Time series classification (TSC) is a fundamental and challenging problem in machine learning. Deep learning models typically achieve remarkable performance in this task but are constrained by the need for vast amounts of labeled data to generalize effectively. In this paper, we present InverseTime, a method that addresses this limitation by incorporating a novel self-supervised pretext task into the training objective. In this task, the training time series are first considered both in their original chronological order and in their reversed state. Then, the model is trained to recognize if time inversion was or was not applied to the input case. We found that this simple task actually provides a supervisory signal that significantly aids model training when explicit category labels are scarce, enabling semi-supervised TSC. Through comprehensive experiments on twelve diverse time-series datasets, spanning different domains, we demonstrate that our method consistently outperforms prior approaches, including various consistency regularization methods. These results show that self-supervision is a promising approach to circumvent the annotation bottleneck in time series applications.
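The time-inversion pretext task described in the InverseTime abstract can be illustrated with a minimal sketch of how the pretext inputs and labels might be built. The function name `make_inversion_pretext_batch`, the array layout (time on axis 1), and the label convention (0 = original order, 1 = reversed) are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np


def make_inversion_pretext_batch(series: np.ndarray):
    """Build inputs and binary labels for a time-inversion pretext task.

    `series` is assumed to have shape (n_series, n_timesteps) or
    (n_series, n_timesteps, n_channels), with time on axis 1.
    Returns the originals followed by their time-reversed copies,
    plus labels: 0 for original order, 1 for reversed (a hypothetical
    convention chosen here).
    """
    reversed_series = series[:, ::-1]  # flip along the time axis only
    inputs = np.concatenate([series, reversed_series], axis=0)
    labels = np.concatenate([
        np.zeros(len(series), dtype=np.int64),
        np.ones(len(series), dtype=np.int64),
    ])
    return inputs, labels


# Two toy univariate series of length 3.
x = np.arange(6, dtype=float).reshape(2, 3)
inputs, labels = make_inversion_pretext_batch(x)
```

A classifier head trained on `labels` would supply the self-supervised signal, which needs no category annotations. How that signal is combined with the supervised loss on the labeled examples is not detailed in the abstract and is left out of this sketch.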