Model Card for Spotify Song Description Model

Model Details

Model Description

This model is a fine-tuned version of facebook/opt-125m that generates natural-language descriptions of a song’s mood, style, and characteristics from metadata: song title, artist, genre, release date, key, tempo, loudness, explicitness, and emotion. It was trained on a 5,000-song subset of a 200k-song Spotify dataset using supervised fine-tuning (SFT) with the trl library. Prompts follow an Alpaca-style template (instruction, input, response), and the generated descriptions are intended for music-related applications.

Developed by: Ayush Singh (GitHub: Ayushsingh1009, Hugging Face: Ayushsingh1009)
Funded by: Not specified
Shared by: Ayush Singh
Model type: Causal Language Model (Text Generation)
Language(s) (NLP): English
License: Apache 2.0 (stated as inherited from facebook/opt-125m; verify against the base model's license terms before reuse)
Finetuned from model: facebook/opt-125m
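
The supervised fine-tuning setup described above pairs each song's metadata with a target description inside the Alpaca-style template. The sketch below shows one plausible way to build a single training string. The function name `format_training_text` and the choice to append an end-of-sequence token are assumptions based on common SFT practice, not the confirmed preprocessing used for this model.

```python
# Alpaca-style template, copied from the getting-started code in this card.
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Based on the song metadata provided, describe the mood, style, and characteristics of the track.

### Input:
{}

### Response:
{}"""


def format_training_text(metadata_text: str, description: str, eos_token: str) -> str:
    """Build one training string (hypothetical helper).

    Appending the EOS token is an assumption: it is a common way to teach
    the model where a description ends.
    """
    return ALPACA_PROMPT.format(metadata_text, description) + eos_token


example = format_training_text(
    "Song: Dancing Queen Artist: ABBA Genre: pop",
    "The song 'Dancing Queen' by ABBA is a pop track.",
    "</s>",  # illustrative EOS string; in practice pass tokenizer.eos_token
)
print(example)
```

With a loaded tokenizer, `tokenizer.eos_token` would be passed as the third argument so the string matches the model's vocabulary.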

Model Sources

Repository: https://huggingface.co/Ayushsingh1009/Spotify-song
Paper: None
Demo: None

Uses

Direct Use

The model can be used to generate descriptive text about songs based on metadata input, suitable for music recommendation systems, playlist curation, or music analysis tools. Users provide metadata (e.g., song title, artist, genre, tempo, emotion), and the model outputs a description of the song’s mood, style, and characteristics.

Example input: Song: Dancing Queen Artist: ABBA Genre: pop Release Date: 1976 Key: A Maj Tempo: 100 BPM Loudness: -6.4 dB Explicit: No Emotion: joy

Example output:

The song 'Dancing Queen' by ABBA is a pop track that evokes a feeling of joy. With a moderate and steady tempo of 100 BPM, it creates a joyful atmosphere. Composed in A Maj, the track delivers high energy (80/100) with a powerful sound. The song is highly danceable (75/100) and carries a positive vibe (85/100). Released in 1976, it captures the classic sound of the 1970s.
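
An input string like the example above could be assembled from structured metadata with a small helper. This is a minimal sketch: the function name `format_song_metadata`, the dictionary keys, and the single-space separator are assumptions chosen to mirror the example, not the exact preprocessing used in training.

```python
# (dict key, printed label) pairs in the order used by the example input.
# The dict keys are hypothetical; only the labels come from the model card.
FIELDS = [
    ("title", "Song"),
    ("artist", "Artist"),
    ("genre", "Genre"),
    ("release_date", "Release Date"),
    ("key", "Key"),
    ("tempo", "Tempo"),
    ("loudness", "Loudness"),
    ("explicit", "Explicit"),
    ("emotion", "Emotion"),
]


def format_song_metadata(song: dict, sep: str = " ") -> str:
    """Render a metadata dict as 'Label: value' pairs, skipping absent fields."""
    parts = []
    for key, label in FIELDS:
        value = song.get(key)
        if value is None:
            continue  # leave missing fields out rather than inventing values
        if key == "tempo":
            value = f"{value} BPM"
        elif key == "loudness":
            value = f"{value} dB"
        elif key == "explicit":
            value = "Yes" if value else "No"
        parts.append(f"{label}: {value}")
    return sep.join(parts)


song = {
    "title": "Dancing Queen",
    "artist": "ABBA",
    "genre": "pop",
    "release_date": 1976,
    "key": "A Maj",
    "tempo": 100,
    "loudness": -6.4,
    "explicit": False,
    "emotion": "joy",
}
print(format_song_metadata(song))
```

Skipping absent fields keeps the prompt honest about what is known, although the model may still estimate values for fields it saw during training.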

Downstream Use

Integration into music streaming platforms to enhance user-facing song descriptions.
Use in educational tools to teach music theory by linking metadata (e.g., tempo, key) to mood and style.
Fine-tuning for specific genres or languages to generate more tailored descriptions.

Out-of-Scope Use

Generating factual data beyond metadata (e.g., lyrics or artist biographies).
Applications requiring real-time audio analysis (the model uses metadata, not audio).
Use in non-English contexts without further fine-tuning, as the training data is primarily English.

Bias, Risks, and Limitations

Bias: The model may reflect biases in the Spotify dataset, such as overrepresentation of popular genres (e.g., pop, hip-hop) or underrepresentation of niche or non-Western music.
Limitations:
- Trained on a 5,000-song subset, which may limit generalization across all music genres and eras.
- Relies on metadata accuracy; missing or incorrect metadata (e.g., energy, danceability) may lead to inaccurate descriptions.
- May generate estimated values for missing fields (e.g., energy, positiveness), which could introduce errors.
- Limited to English descriptions due to training data.
Risks: Misinterpretation of emotional or stylistic descriptions could affect user experience in recommendation systems. Incorrect metadata could lead to misleading outputs.

Recommendations

Users should verify input metadata for accuracy before generating descriptions.
Developers should disclose that outputs are based on metadata and may include estimates for missing fields.
Consider further fine-tuning on diverse datasets to improve coverage of underrepresented genres or languages.
Evaluate outputs for specific use cases to ensure alignment with user expectations.
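
The first recommendation, checking metadata before generation, might look like the sketch below. The function name `find_metadata_problems`, the field names, and the plausibility ranges (tempo between 20 and 300 BPM, loudness at or below 0 dB) are illustrative assumptions, not rules taken from the dataset.

```python
# Fields treated as required here; this selection is a hypothetical choice.
REQUIRED_FIELDS = ("title", "artist", "genre", "tempo", "emotion")


def find_metadata_problems(song: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []

    # Flag fields that are absent or empty.
    for field in REQUIRED_FIELDS:
        if song.get(field) in (None, ""):
            problems.append(f"missing field: {field}")

    # Flag numeric values outside the assumed plausible ranges.
    tempo = song.get("tempo")
    if isinstance(tempo, (int, float)) and not 20 <= tempo <= 300:
        problems.append(f"implausible tempo: {tempo} BPM")

    loudness = song.get("loudness")
    if isinstance(loudness, (int, float)) and loudness > 0:
        problems.append(f"implausible loudness: {loudness} dB")

    return problems


print(find_metadata_problems({"title": "Dancing Queen", "artist": "", "tempo": 1000}))
```

A caller could refuse to generate, or attach a notice to the output, whenever the returned list is not empty.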

How to Get Started with the Model

Use the following code to load and use the model:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "Ayushsingh1009/Spotify-song"  # Update with correct repository name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Define Alpaca-style prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Based on the song metadata provided, describe the mood, style, and characteristics of the track.

### Input:
{}

### Response:
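{}"""

# Fill the Input slot with song metadata (taken from the Direct Use example)
# and leave the Response slot empty for the model to complete
metadata = "Song: Dancing Queen Artist: ABBA Genre: pop Release Date: 1976 Key: A Maj Tempo: 100 BPM Loudness: -6.4 dB Explicit: No Emotion: joy"
prompt = alpaca_prompt.format(metadata, "")

# Generate a description; max_new_tokens=128 is an illustrative choice
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, not the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))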
