BiCA-small / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:129971
  - loss:MultipleNegativesRankingLoss
base_model: thenlper/gte-small
widget:
  - source_sentence: >-
      Integrated health care for infectious diseases and non-communicable
      diseases in low-and middle-income countries
    sentences:
      - >-
        The purposes of this study were to create a new flow-chart of
        prehospital electrocardiography (ECG)-transmission, evaluate its
        predictive ability for ST-elevation myocardial infarction (STEMI) and
        shorten door-to-balloon time (DTBT). The new transmission flow-chart was
        created using symptoms from previous medical records of STEMI patients.
        A total of 4090 consecutive patients transferred emergently to our
        hospital were divided into two groups: those in ambulances with an
        ECG-transmission device with the new flow-chart (ECGT-FC) and those
        transferred without an ECG-transmission device (non-ECGT) groups. A
        STEMI group comprising walk-in patients during the same period was used
        as a control group. The predictive ability of STEMI and the
        effectiveness of shortening the DTBT by the new flow-chart of
        ECG-transmission was evaluated. In the ECGT-FC group, the prevalence of
        STEMI in the ECG-transmission by the new flow-chart were significantly
        higher than in the non-ECG-transmission patients (6.71% vs. 0.19%;
        p<0.001). The sensitivity and specificity of the new ECG-transmission
        flow-chart were 83.3% and 88.1%, respectively. The median DTBT was
        significantly shortened (p=0.045) and the prevalence of DTBT<90min was
        significantly higher in the ECGT-FC group (p=0.018) than the other
        groups. The sensitivity and specificity of the new flow-chart for
        ECG-transmission were high. The new flow-chart combined with an
        ECG-transmission device could detect STEMI efficiently and shorten DTBT.
      - >-
        Multiple strains of the SARS-CoV-2 have arisen and jointly influence the
        trajectory of the coronavirus disease (COVID-19) pandemic. However,
        current models rarely account for this multi-strain dynamics and their
        different transmission rate and response to vaccines. We propose a new
        mathematical model that accounts for two virus variants and the
        deployment of a vaccination program. To demonstrate utility, we applied
        the model to determine the control reproduction number 
      - >-
        The co-occurrence of infectious diseases (ID) and non-communicable
        diseases (NCD) is widespread, presenting health service delivery
        challenges especially in low-and middle-income countries (LMICs).
        Integrated health care is a possible solution but may require a paradigm
        shift to be successfully implemented. This literature review identifies
        integrated care examples among selected ID and NCD dyads. We searched
        PubMed, PsycINFO, Cochrane Library, CINAHL, Web of Science, EMBASE,
        Global Health Database, and selected clinical trials registries.
        Eligible studies were published between 2010 and December 2022,
        available in English, and report health service delivery programs or
        policies for the selected disease dyads in LMICs. We identified 111
        studies that met the inclusion criteria, including 56 on tuberculosis
        and diabetes integration, 46 on health system adaptations to treat
        COVID-19 and cardiometabolic diseases, and 9 on COVID-19, diabetes, and
        tuberculosis screening. Prior to the COVID-19 pandemic, most studies on
        diabetes-tuberculosis integration focused on clinical service delivery
        screening. By far the most reported health system outcomes across all
        studies related to health service delivery (n = 72), and 19 addressed
        health workforce. Outcomes related to health information systems (n =
        5), leadership and governance (n = 3), health financing (n = 2), and
        essential medicines (n = 4)) were sparse. Telemedicine service delivery
        was the most common adaptation described in studies on COVID-19 and
        either cardiometabolic diseases or diabetes and tuberculosis. ID-NCD
        integration is being explored by health systems to deal with
        increasingly complex health needs, including comorbidities. High excess
        mortality from COVID-19 associated with NCD-related comorbidity prompted
        calls for more integrated ID-NCD surveillance and solutions. Evidence of
        clinical integration of health service delivery and workforce has
        grown-especially for HIV and NCDs-but other health system building
        blocks, particularly access to essential medicines, health financing,
        and leadership and governance, remain in disease silos.
  - source_sentence: >-
      Foot-and-mouth disease virus 3C(pro) inhibits interferon-α/β response and
      expression of IFN-stimulated genes
    sentences:
      - >-
        Repeated bottleneck passages of RNA viruses result in accumulation of
        mutations and fitness decrease. Here, we show that clones of
        foot-and-mouth disease virus (FMDV) subjected to bottleneck passages, in
        the form of plaque-to-plaque transfers in BHK-21 cells, increased the
        thermosensitivity of the viral clones. By constructing infectious FMDV
        clones, we have identified the amino acid substitution M54I in capsid
        protein VP1 as one of the lesions associated with thermosensitivity.
        M54I affects processing of precursor P1, as evidenced by decreased
        production of VP1 and accumulation of VP1 precursor proteins. The defect
        is enhanced at high temperatures. Residue M54 of VP1 is exposed on the
        virion surface, and it is close to the B-C loop where an antigenic site
        of FMDV is located. M54 is not in direct contact with the VP1-VP3
        cleavage site, according to the three-dimensional structure of FMDV
        particles. Models to account for the effect of M54 in processing of the
        FMDV polyprotein are proposed. In addition to revealing a distance
        effect in polyprotein processing, these results underline the importance
        of pursuing at the biochemical level the biological defects that arise
        when viruses are subjected to multiple bottleneck events.
      - >-
        To improve the delivery of liposomes to tumors using P-selectin
        glycoprotein ligand 1 (PSGL1) mediated binding to selectin molecules,
        which are upregulated on tumor-associated endothelium.
      - >-
        Foot-and-mouth disease is a highly contagious viral illness of wild and
        domestic cloven-hoofed animals. The causative agent, foot-and-mouth
        disease virus (FMDV), replicates rapidly, efficiently disseminating
        within the infected host and being passed on to susceptible animals via
        direct contact or the aerosol route. To survive in the host, FMDV has
        evolved to block the host interferon (IFN) response. Previously, we and
        others demonstrated that the leader proteinase (L(pro)) of FMDV is an
        IFN antagonist. Here, we report that another FMDV-encoded proteinase,
        3C(pro), also inhibits IFN-α/β response and the expression of
        IFN-stimulated genes. Acting in a proteasome- and caspase-independent
        manner, the 3C(pro) of FMDV proteolytically cleaved nuclear
        transcription factor kappa B (NF-κB) essential modulator (NEMO), a
        bridging adaptor protein essential for activating both NF-κB and
        interferon-regulatory factor signaling pathways. 3C(pro) specifically
        targeted NEMO at the Gln 383 residue, cleaving off the C-terminal zinc
        finger domain from the protein. This cleavage impaired the ability of
        NEMO to activate downstream IFN production and to act as a signaling
        adaptor of the RIG-I/MDA5 pathway. Mutations specifically disrupting the
        cysteine protease activity of 3C(pro) abrogated NEMO cleavage and the
        inhibition of IFN induction. Collectively, our data identify NEMO as a
        substrate for FMDV 3C(pro) and reveal a novel mechanism evolved by a
        picornavirus to counteract innate immune signaling.
  - source_sentence: Measuring flourishing among adolescent and adult populations
    sentences:
      - >-
        Flourishing is an evolving wellbeing construct and outcome of interest
        across the social and biological sciences. Despite some conceptual
        advancements, there remains limited consensus on how to measure
        flourishing, as well as how to distinguish it from closely related
        wellbeing constructs, such as thriving and life satisfaction. This paper
        aims to provide an overview and comparison of the diverse scales that
        have been developed to measure flourishing among adolescent and adult
        populations to provide recommendations for future studies seeking to use
        flourishing as an outcome in social and biological research.
      - >-
        Although well-being at work is important for occupational health,
        multi-dimensional workplace well-being measures do not exist for
        Japanese workers. The purpose of this study was to investigate the
        validity of the Japanese version of the Workplace PERMA-Profiler.
        Japanese workers completed online surveys at baseline (N = 310) and 1
        month later (N = 100). The Workplace PERMA-Profiler was translated
        according to international guidelines. Job and life satisfaction, work
        engagement, psychological distress, work-related psychosocial factors,
        and work performance were measured as comparisons for convergent
        validity. Cronbach's alphas, Intra-class Correlation Coefficients
        (ICCs), and measurement errors were calculated for the reliability, and
        the validity of the measure was tested by correlational analyses and
        confirmatory factor analysis. A total of 310 (baseline) and 86
        (follow-up) workers responded and were included in the analyses.
        Cronbach's alphas and ICCs of the Japanese Workplace PERMA-Profiler
        ranged from 0.75 to 0.96. Confirmatory factor analysis indicated that
        the 5-factor model demonstrated a marginally acceptable fit (χ2 (80) =
        351.30, CFI = 0.892, TLI = 0.858, RMSEA = 0.105, SRMR = 0.051). Overall
        well-being and the five PERMA domains had moderate-to-strong
        correlations with job satisfaction, psychological distress (inversely),
        and work-related factors. The Japanese version of the Workplace
        PERMA-Profiler demonstrated adequate reliability and validity. This
        measure could be useful to assess well-being at work, promote well-being
        research among Japanese workers, and address the problem of definition
        for well-being in further studies.
      - >-
        We experience countless pieces of new information each day, but
        remembering them later depends on firmly instilling memory storage in
        the brain. Numerous studies have implicated non-rapid eye movement
        (NREM) sleep in consolidating memories via interactions between
        hippocampus and cortex. However, the temporal dynamics of this
        hippocampal-cortical communication and the concomitant neural
        oscillations during memory reactivations remains unclear. To address
        this issue, the present study used the procedure of targeted memory
        reactivation (TMR) following learning of object-location associations to
        selectively reactivate memories during human NREM sleep. Cortical
        pattern reactivation and hippocampal-cortical coupling were measured
        with intracranial EEG recordings in patients with epilepsy. We found
        that TMR produced variable amounts of memory enhancement across a set of
        object-location associations. Successful TMR increased hippocampal
        ripples and cortical spindles, apparent during two discrete sweeps of
        reactivation. The first reactivation sweep was accompanied by increased
        hippocampal-cortical communication and hippocampal ripple events coupled
        to local cortical activity (cortical ripples and high-frequency
        broadband activity). In contrast, hippocampal-cortical coupling
        decreased during the second sweep, while increased cortical spindle
        activity indicated continued cortical processing to achieve long-term
        storage. Taken together, our findings show how dynamic patterns of
        item-level reactivation and hippocampal-cortical communication support
        memory enhancement during NREM sleep.
  - source_sentence: >-
      Agrobacterium tumefaciens Hfq binds to sRNA AbcR1 and its target mRNA
      atu2422
    sentences:
      - >-
        Amyloid β (Aβ) assemblies exist not only in the central nervous system,
        but can circulate within the bloodstream to trigger and exacerbate
        peripheral, cerebrovascular, and neurodegenerative disorders.
        Eliminating excess peripheral Aβ fibrils, therefore, holds promise to
        improve the management of amyloid-related diseases. Here, we present
        nanoemulsion-mediated ultrasonic ablation of circulating Aβ fibrils to
        both destroy established plaques and prevent the re-growth of ablated
        fragments back into toxic species. This approach is made possible using
        a de novo designed peptide emulsifier that contains the self-associating
        sequence from the amyloid precursor protein. Emulsification of the
        peptide surfactant with fluorous nanodroplets produces contrast agents
        that rapidly adsorb Aβ assemblies and allows their ultrasound-controlled
        destruction via acoustic cavitation. Vessel-mimetic flow experiments
        demonstrate that nanoemulsion-assisted Aβ disruption can be achieved in
        circulation using clinical diagnostic ultrasound transducers. Additional
        cell-based assays confirm the ablated fragments are less toxic to
        neuronal and glial cells compared to mature fibrils, and can be rapidly
        phagocytosed by both peripheral and brain macrophages. These results
        highlight the potential of nanoemulsion contrast agents to deliver new
        imaging enabled strategies for non-invasive management of Aβ-related
        diseases using traditional diagnostic ultrasound modalities.
      - >-
        The Hfq protein mediates gene regulation by small RNAs (sRNAs) in about
        50% of all bacteria. Depending on the species, phenotypic defects of an
        hfq mutant range from mild to severe. Here, we document that the
        purified Hfq protein of the plant pathogen and natural genetic engineer
        Agrobacterium tumefaciens binds to the previously described sRNA AbcR1
        and its target mRNA atu2422, which codes for the substrate binding
        protein of an ABC transporter taking up proline and γ-aminobutyric acid
        (GABA). Several other ABC transporter components were overproduced in an
        hfq mutant compared to their levels in the parental strain, suggesting
        that Hfq plays a major role in controlling the uptake systems and
        metabolic versatility of A. tumefaciens. The hfq mutant showed delayed
        growth, altered cell morphology, and reduced motility. Although the
        DNA-transferring type IV secretion system was produced, tumor formation
        by the mutant strain was attenuated, demonstrating an important
        contribution of Hfq to plant transformation by A. tumefaciens.
      - >-
        Hfq is an RNA-binding protein that functions in post-transcriptional
        gene regulation by mediating interactions between mRNAs and small
        regulatory RNAs (sRNAs). Two proteins encoded by BAB1_1794 and BAB2_0612
        are highly over-produced in a Brucella abortus hfq mutant compared with
        the parental strain, and recently, expression of orthologues of these
        proteins in Agrobacterium tumefaciens was shown to be regulated by two
        sRNAs, called AbcR1 and AbcR2. Orthologous sRNAs (likewise designated
        AbcR1 and AbcR2) have been identified in B. abortus 2308. In Brucella,
        abcR1 and abcR2 single mutants are not defective in their ability to
        survive in cultured murine macrophages, but an abcR1 abcR2 double mutant
        exhibits significant attenuation in macrophages. Additionally, the abcR1
        abcR2 double mutant displays significant attenuation in a mouse model of
        chronic Brucella infection. Quantitative proteomics and microarray
        analyses revealed that the AbcR sRNAs predominantly regulate genes
        predicted to be involved in amino acid and polyamine transport and
        metabolism, and Northern blot analyses indicate that the AbcR sRNAs
        accelerate the degradation of the target mRNAs. In an Escherichia coli
        two-plasmid reporter system, overexpression of either AbcR1 or AbcR2 was
        sufficient for regulation of target mRNAs, indicating that the AbcR
        sRNAs from B. abortus 2308 perform redundant regulatory functions.
  - source_sentence: >-
      Neural correlates of advice evaluation and integration in the
      judge-advisor paradigm
    sentences:
      - >-
        Considering advice from others is a pervasive element of human social
        life. We used the judge-advisor paradigm to investigate the neural
        correlates of advice evaluation and advice integration by means of
        functional magnetic resonance imaging. Our results demonstrate that
        evaluating advice recruits the "mentalizing network," brain regions
        activated when people think about others' mental states. Important
        activation differences exist, however, depending upon the perceived
        competence of the advisor. Consistently, additional analyses demonstrate
        that integrating others' advice, i.e., how much participants actually
        adjust their initial estimate, correlates with neural activity in the
        centromedial amygdala in the case of a competent and with activity in
        visual cortex in the case of an incompetent advisor. Taken together, our
        findings, therefore, demonstrate that advice evaluation and integration
        rely on dissociable neural mechanisms and that significant differences
        exist depending upon the advisor's reputation, which suggests different
        modes of processing advice depending upon the perceived competence of
        the advisor.
      - >-
        The role of antibodies in kidney transplant (KT) has evolved
        significantly over the past few decades. This role of antibodies in KT
        is multifaceted, encompassing both the challenges they pose in terms of
        antibody-mediated rejection (AMR) and the opportunities for improving
        transplant outcomes through better detection, prevention, and treatment
        strategies. As our understanding of the immunological mechanisms
        continues to evolve, so too will the approaches to managing and
        harnessing the power of antibodies in KT, ultimately leading to improved
        patient and graft survival. This narrative review explores the
        multifaceted roles of antibodies in KT, including their involvement in
        rejection mechanisms, advancements in desensitization protocols, AMR
        treatments, and their potential role in monitoring and improving graft
        survival.
      - >-
        Humans regulate intergroup conflict through parochial altruism; they
        self-sacrifice to contribute to in-group welfare and to aggress against
        competing out-groups. Parochial altruism has distinct survival
        functions, and the brain may have evolved to sustain and promote
        in-group cohesion and effectiveness and to ward off threatening
        out-groups. Here, we have linked oxytocin, a neuropeptide produced in
        the hypothalamus, to the regulation of intergroup conflict. In three
        experiments using double-blind placebo-controlled designs, male
        participants self-administered oxytocin or placebo and made decisions
        with financial consequences to themselves, their in-group, and a
        competing out-group. Results showed that oxytocin drives a "tend and
        defend" response in that it promoted in-group trust and cooperation, and
        defensive, but not offensive, aggression toward competing out-groups.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
language:
  - en

SentenceTransformer based on thenlper/gte-small

This is a sentence-transformers model finetuned from thenlper/gte-small. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: thenlper/gte-small
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
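
The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It assumes token embeddings and an attention mask are already available and uses tiny 3-dimensional toy values instead of real 384-dimensional outputs. The names `mean_pool` and `l2_normalize` are illustrative and are not part of the Sentence Transformers API.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring what a normalization step does."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # mean of the first two rows only
embedding = l2_normalize(pooled)      # unit-length sentence embedding
print(pooled)
print(embedding)
```

Because the final vectors have unit length, the cosine similarity between two embeddings equals their dot product.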

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Neural correlates of advice evaluation and integration in the judge-advisor paradigm',
    'Considering advice from others is a pervasive element of human social life. We used the judge-advisor paradigm to investigate the neural correlates of advice evaluation and advice integration by means of functional magnetic resonance imaging. Our results demonstrate that evaluating advice recruits the "mentalizing network," brain regions activated when people think about others\' mental states. Important activation differences exist, however, depending upon the perceived competence of the advisor. Consistently, additional analyses demonstrate that integrating others\' advice, i.e., how much participants actually adjust their initial estimate, correlates with neural activity in the centromedial amygdala in the case of a competent and with activity in visual cortex in the case of an incompetent advisor. Taken together, our findings, therefore, demonstrate that advice evaluation and integration rely on dissociable neural mechanisms and that significant differences exist depending upon the advisor\'s reputation, which suggests different modes of processing advice depending upon the perceived competence of the advisor.',
    'Humans regulate intergroup conflict through parochial altruism; they self-sacrifice to contribute to in-group welfare and to aggress against competing out-groups. Parochial altruism has distinct survival functions, and the brain may have evolved to sustain and promote in-group cohesion and effectiveness and to ward off threatening out-groups. Here, we have linked oxytocin, a neuropeptide produced in the hypothalamus, to the regulation of intergroup conflict. In three experiments using double-blind placebo-controlled designs, male participants self-administered oxytocin or placebo and made decisions with financial consequences to themselves, their in-group, and a competing out-group. Results showed that oxytocin drives a "tend and defend" response in that it promoted in-group trust and cooperation, and defensive, but not offensive, aggression toward competing out-groups.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9575, 0.8147],
#         [0.9575, 1.0000, 0.8303],
#         [0.8147, 0.8303, 1.0000]])
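
To use such scores for retrieval, one option is to treat the first sentence as the query and rank the remaining texts by their similarity to it. The sketch below reuses the scores printed above as a plain Python list, so it runs without downloading the model. `rank_candidates` is a hypothetical helper, not a Sentence Transformers function.

```python
def rank_candidates(query_scores, query_index=0):
    """Return (index, score) pairs for all non-query items, best first."""
    candidates = [
        (i, score) for i, score in enumerate(query_scores) if i != query_index
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# First row of the similarity matrix shown above (query vs. all sentences).
query_row = [1.0000, 0.9575, 0.8147]

ranking = rank_candidates(query_row)
print(ranking)
# [(1, 0.9575), (2, 0.8147)]
```

Here the abstract that matches the query title (index 1) ranks above the unrelated abstract (index 2).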

Training Details

Training Dataset

Unnamed Dataset

  • Size: 129,971 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | sentence_2 |
    |:--------|:-----------|:-----------|:-----------|
    | type    | string | string | string |
    | details | min: 6 tokens; mean: 19.55 tokens; max: 56 tokens | min: 3 tokens; mean: 210.7 tokens; max: 512 tokens | min: 24 tokens; mean: 312.31 tokens; max: 512 tokens |
  • Samples:
    | sentence_0 | sentence_1 | sentence_2 |
    |:-----------|:-----------|:-----------|
    | Microbiology and immunomics in male infertility | Up to 50% of infertility is caused by the male side. Varicocele, orchitis, prostatitis, oligospermia, asthenospermia, and azoospermia are common causes of impaired male reproductive function and male infertility. In recent years, more and more studies have shown that microorganisms play an increasingly important role in the occurrence of these diseases. This review will discuss the microbiological changes associated with male infertility from the perspective of etiology, and how microorganisms affect the normal function of the male reproductive system through immune mechanisms. Linking male infertility with microbiome and immunomics can help us recognize the immune response under different disease states, providing more targeted immune target therapy for these diseases, and even the possibility of combined immunotherapy and microbial therapy for male infertility. | There are currently no sensitive and specific assays for activin B that could be utilized to study human biological fluids. The aim of this project was to develop and validate a 'total' activin B ELISA for use with human biological fluids and establish concentrations of activin B in the circulation and fluids from the reproductive organs. The new ELISA was validated and then used to measure activin B levels in the circulation of healthy participants, IVF patients, pregnant women and in ovarian follicular fluid and seminal plasma. Healthy adult subjects (n = 143), subjects from an IVF clinic (n = 27) and pregnancy groups (n = 29) were sampled. The sensitivity of the assay was 0.019 ng/ml. Validation of the activin B ELISA showed good recovery (90.7 +/- 9.8%) and linearity in biological fluid and cell culture media and low cross-reactivity with related analytes (inhibin B = 0.077% and activin A = 0.0034%). There was a negative correlation between activin B concentration (r = -0.281, P < ... |
    | Biomarkers of heterogeneity in type 1 diabetes | The 'Biomarkers of heterogeneity in type 1 diabetes' study cohort was set up to identify genetic, physiological and psychosocial factors explaining the observed heterogeneity in disease progression and the development of complications in people with long-standing type 1 diabetes (T1D). | In patients with type 1 diabetes, there has been concern about the effects of recurrent hypoglycaemia and chronic hyperglycaemia on cognitive function. Because other biomedical factors may also increase the risk of cognitive decline, this study examined whether macrovascular risk factors (hypertension, smoking, hypercholesterolaemia, obesity), sub-clinical macrovascular disease (carotid intima-media thickening, coronary calcification) and microvascular complications (retinopathy, nephropathy) were associated with decrements in cognitive function over an extended time period. Type 1 diabetes patients (n = 1,144) who had completed a comprehensive cognitive test battery at entry into the Diabetes Control and Complications Trial were re-assessed at a mean of 18.5 (range: 15-23) years later. Univariate and multivariable models examined the relationship between cognitive change and the presence of micro- and macrovascular complications and risk factors. Univariate modelling showed that smoki... |
    | Role of Molecular Profiling and Subgroups in Pediatric Medulloblastoma | As advances in the molecular and genetic profiling of pediatric medulloblastoma evolve, associations with prognosis and treatment are found (prognostic and predictive biomarkers) and research is directed at molecular therapies. Medulloblastoma typically affects young patients, where the implications of any treatment on the developing brain must be carefully considered. The aim of this article is to provide a clear comprehensible update on the role molecular profiling and subgroups in pediatric medulloblastoma as it is likely to contribute significantly toward prognostication. Knowledge of this classification is of particular interest because there are new molecular therapies targeting the Shh subgroup of medulloblastomas. | The Wnt/beta-catenin pathway plays important roles during embryonic development and growth control. The B56 regulatory subunit of protein phosphatase 2A (PP2A) has been implicated as a regulator of this pathway. However, this has not been investigated by loss-of-function analyses. Here we report loss-of-function analysis of PP2A:B56epsilon during early Xenopus embryogenesis. We provide direct evidence that PP2A:B56epsilon is required for Wnt/beta-catenin signaling upstream of Dishevelled and downstream of the Wnt ligand. We show that maternal PP2A:B56epsilon function is required for dorsal development, and PP2A:B56epsilon function is required later for the expression of the Wnt target gene engrailed, for subsequent midbrain-hindbrain boundary formation, and for closure of the neural tube. These data demonstrate a positive role for PP2A:B56epsilon in the Wnt pathway. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
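
As a rough illustration of what these two parameters control, the sketch below computes a multiple-negatives ranking objective in pure Python: cosine similarities are multiplied by `scale`, and each anchor is scored with cross-entropy against all candidates in the batch, where its own positive is the correct class. It assumes that sentence_0 is the anchor, sentence_1 the positive, and sentence_2 a hard negative, which the card does not state explicitly. `mnrl_loss` is an illustrative name and this is not the library's implementation.

```python
import math

def mnrl_loss(similarities, scale=20.0):
    """Mean cross-entropy where row i's correct candidate is column i.

    similarities[i][j] is the cosine similarity between anchor i and
    candidate j; columns beyond len(similarities) are extra negatives.
    """
    losses = []
    for i, row in enumerate(similarities):
        logits = [scale * s for s in row]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch: 2 anchors, 2 positives (columns 0-1), 2 hard negatives (columns 2-3).
well_separated = [[0.9, 0.1, 0.3, 0.2],
                  [0.2, 0.8, 0.1, 0.3]]
uninformative = [[0.5, 0.5, 0.5, 0.5],
                 [0.5, 0.5, 0.5, 0.5]]

print(mnrl_loss(well_separated))   # close to zero
print(mnrl_loss(uninformative))    # equals log(4), i.e. chance level
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity translate into larger differences in the loss.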
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 1
  • max_steps: 20
  • multi_dataset_batch_sampler: round_robin
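
In the Hugging Face trainer, a positive `max_steps` overrides `num_train_epochs`, so the values above bound how many samples a run sees. The short calculation below works this out from the numbers listed in this card. The device count is an assumption (the card does not state it), and the result only describes the run if the listed values reflect the full training configuration.

```python
# Values taken from the hyperparameter lists and the dataset size in this card.
dataset_size = 129_971
per_device_train_batch_size = 32
max_steps = 20
gradient_accumulation_steps = 1

# Assumption: a single training device (the card does not state the count).
num_devices = 1

samples_seen = (
    max_steps * per_device_train_batch_size
    * gradient_accumulation_steps * num_devices
)
fraction_of_dataset = samples_seen / dataset_size

print(samples_seen)                     # 640
print(f"{fraction_of_dataset:.2%}")     # 0.49%
```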

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: 20
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
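
The combination of `lr_scheduler_type: linear`, `warmup_steps: 0`, `learning_rate: 5e-05`, and `max_steps: 20` implies a learning rate that decays linearly to zero over the run. The following is a minimal sketch of that schedule shape. `linear_lr` is an illustrative function written for this example, not a call into the Transformers library.

```python
def linear_lr(step, base_lr=5e-05, max_steps=20, warmup_steps=0):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)

# Print the rate at the start, midpoint, and end of the listed schedule.
for step in (0, 10, 20):
    print(step, linear_lr(step))
```

With no warmup, the rate starts at the full base value, reaches half of it at step 10, and is zero at step 20.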

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 5.0.0
  • Transformers: 4.53.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1
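
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` may help. The distribution names used as keys (for example `torch` for PyTorch) are assumptions about how the packages were installed, and `compare_versions` is a hypothetical helper written for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
# (Python 3.10.14 itself is not a pip package and is not checked here.)
EXPECTED = {
    "sentence-transformers": "5.0.0",
    "transformers": "4.53.2",
    "torch": "2.6.0+cu124",
    "accelerate": "1.6.0",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}

def compare_versions(expected):
    """Map each package to (installed_version_or_None, expected_version)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted)
    return report

for name, (installed, wanted) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: installed={installed} listed={wanted} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports where the environment differs from the one recorded in this card.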

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

If our work was helpful, please consider citing us ☺️

@misc{sinha2025bicaeffectivebiomedicaldense,
      title={BiCA: Effective Biomedical Dense Retrieval with Citation-Aware Hard Negatives}, 
      author={Aarush Sinha and Pavan Kumar S and Roshan Balaji and Nirav Pravinbhai Bhatt},
      year={2025},
      eprint={2511.08029},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2511.08029}, 
}