A tool for processing and expanding image metadata tags using semantic similarity and LLM validation. This tool helps organize image collections by identifying and grouping related tags while maintaining semantic hierarchies.
- Extracts existing metadata tags from images
- Splits compound terms based on uniqueness ratio
- Generates semantic embeddings for tags using llama.cpp
- Identifies semantically similar tags using FAISS
- Validates tag relationships using LLM
- Updates image metadata with expanded tag sets
- Preserves semantic hierarchies (e.g., "metal" -> "material" but not "metal" -> "brass")
- Uses only keywords that already exist in your image library metadata
- Python 3.8+
- ExifTool
- llama.cpp embedding server
- KoboldCpp
Python dependencies:
exiftool
numpy
faiss-cpu
json-repair
koboldapi-python
- Install ExifTool for your platform: https://exiftool.org/
- Clone this repository
- Install Python dependencies:
pip install -r requirements.txt
- Download a GGUF format embedding model (e.g., all-MiniLM-L6-v2) and language model (e.g., gemma-2-2b-it)
- Build or obtain llama.cpp and place the llama-embedding binary and any other needed files in the KeywordExpander folder
- Download KoboldCpp and launch it with the language model
Basic usage:
python KeywordExpander.py /path/to/image/directory \
--model-path /path/to/embedding-model.gguf \
--llama-path /path/to/llama-embedding \
--api-url http://localhost:5001
The script will:
- Extract current metadata from images
- Generate tag embeddings
- Find similar tags
- Validate relationships using LLM
- Update image metadata with expanded tags
The script generates JSON files in the target directory:
KeywordExpander_metadata.json: Raw metadata from images
KeywordExpander_expansions.json: Expansion mappings
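These files can be inspected before trusting the updated tags. A minimal sketch, assuming the expansions file maps each keyword to its list of validated candidates (the actual schema may differ):

```python
import json
from pathlib import Path

# Hypothetical inspection helper; point it at your own image directory.
target_dir = Path("/path/to/image/directory")
with open(target_dir / "KeywordExpander_expansions.json", encoding="utf-8") as f:
    expansions = json.load(f)

for keyword, candidates in expansions.items():
    print(f"{keyword} -> {', '.join(candidates)}")
```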
The script does the following:
- Collects all of the keywords for images in a directory and all its subdirectories
- Divides keywords into single-word keywords and multi-word keywords, which it calls 'compounds'. A compound is composed of a 'modifier' as the first word, a 'base' as the last word, and possibly a conjunction ('and' or 'or') in the middle
- Any two-word compound whose modifier is a common color is split into its two single keywords, so
'blue car' and 'red car' become 'blue', 'red', and 'car'
- Any remaining two-word compounds are analyzed by taking the number of unique modifier-and-base pairs for a given modifier and dividing it by the total number of compounds with that modifier, giving a number between 0 and 1. Example:
'wooden table', 'wooden floor', 'wooden floor', 'wooden leg' count 3 unique pairs out of 4 total, for 3 / 4 = 0.75.
If this ratio meets or exceeds a threshold and the total number of modifier occurrences meets a certain minimum, then any compounds with that modifier are split into single words. For the previous example with a setting of 0.75 we would get 'wooden', 'table', 'floor', 'leg' (see the first sketch after this list)
- The lists of single and compound keywords are combined and deduplicated so that only one of any specific keyword remains. The list is sent to llama.cpp's embedding engine, the resulting embeddings are indexed using FAISS, and the similarity between all of them is checked. If any compounds or single words are similar enough, they are mapped together as a 'Keyword' with a list of 'Candidates' to be used along with that keyword any time it appears in image metadata. For instance, 'dog' would be a keyword with a candidate list of 'poodle', 'canine', 'mammal', 'pet' (see the FAISS sketch below)
- Each keyword and its candidate list is sent to an LLM through the KoboldCpp API with a prompt asking it to validate appropriate candidates and discard inappropriate ones (see the validation sketch below). A candidate qualifies for validation if it is an exact synonym or a parent category of the keyword. In the previous case for 'dog', 'canine' and 'mammal' would be validated and 'poodle' discarded, because a 'poodle' is a subtype of 'dog' -- we wouldn't want 'poodle' attached to every image containing a dog, because the dog in a given image may not be a poodle
- When the validations are complete, the validated candidates are added to every image where the keyword is present, keeping the image's original keywords as well (see the write-back sketch below)
- In this way we hope to expand each image's keywords with the relevant, validated terms
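To make the uniqueness-ratio rule concrete, here is a minimal sketch of that splitting step. The function name, threshold, and minimum-occurrence defaults are illustrative placeholders, not the script's actual parameters:

```python
from collections import defaultdict

def split_by_uniqueness(compounds, threshold=0.75, min_occurrences=3):
    """Split compounds whose modifier is paired with many different bases.

    `compounds` is a list of (modifier, base) pairs, e.g. ("wooden", "table").
    For each modifier, the ratio is the number of unique (modifier, base)
    pairs divided by the total number of compounds using that modifier.
    """
    by_modifier = defaultdict(list)
    for modifier, base in compounds:
        by_modifier[modifier].append(base)

    singles, kept = set(), []
    for modifier, bases in by_modifier.items():
        ratio = len(set(bases)) / len(bases)
        if ratio >= threshold and len(bases) >= min_occurrences:
            singles.add(modifier)
            singles.update(bases)
        else:
            kept.extend(f"{modifier} {base}" for base in bases)
    return singles, kept

# 'wooden table', 'wooden floor', 'wooden floor', 'wooden leg'
# -> 3 unique pairs / 4 total = 0.75, so everything splits into singles.
print(split_by_uniqueness([("wooden", "table"), ("wooden", "floor"),
                           ("wooden", "floor"), ("wooden", "leg")]))
```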
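The FAISS similarity grouping can be sketched as follows, assuming the embeddings have already been produced elsewhere (for example by the llama-embedding binary) as one float32 vector per keyword; the threshold and top_k values are placeholders:

```python
import numpy as np
import faiss

def find_candidates(keywords, embeddings, threshold=0.8, top_k=10):
    """Map each keyword to other keywords whose embeddings are close to it."""
    vecs = np.asarray(embeddings, dtype="float32")
    faiss.normalize_L2(vecs)                    # cosine similarity via inner product
    index = faiss.IndexFlatIP(vecs.shape[1])
    index.add(vecs)

    k = min(top_k, len(keywords))
    scores, ids = index.search(vecs, k)
    mapping = {}
    for i, keyword in enumerate(keywords):
        candidates = [keywords[j] for score, j in zip(scores[i], ids[i])
                      if j != i and score >= threshold]
        if candidates:
            mapping[keyword] = candidates       # e.g. 'dog' -> ['canine', 'poodle', ...]
    return mapping
```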
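The LLM validation pass can be approximated against KoboldCpp's standard /api/v1/generate endpoint; the prompt wording, sampling settings, and helper name below are illustrative, not the script's actual prompt:

```python
import json
import requests
from json_repair import repair_json

PROMPT = """For the keyword "{keyword}", keep only candidates that are exact synonyms
or parent categories of the keyword; discard subtypes and unrelated terms.
Candidates: {candidates}
Reply with a JSON list of the validated candidates."""

def validate_candidates(keyword, candidates, api_url="http://localhost:5001"):
    payload = {
        "prompt": PROMPT.format(keyword=keyword, candidates=", ".join(candidates)),
        "max_length": 200,
        "temperature": 0.2,
    }
    resp = requests.post(f"{api_url}/api/v1/generate", json=payload, timeout=120)
    resp.raise_for_status()
    text = resp.json()["results"][0]["text"]
    validated = json.loads(repair_json(text))   # tolerate slightly malformed JSON
    if not isinstance(validated, list):
        return []
    return [c for c in validated if c in candidates]

# validate_candidates("dog", ["poodle", "canine", "mammal", "pet"])
# should keep "canine" and "mammal" and drop "poodle".
```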
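Finally, the write-back can be sketched with the ExifTool CLI, whose += operator appends to an existing list tag rather than replacing it. The script itself may write different tag fields or use Python bindings instead:

```python
import subprocess

def append_keywords(image_path, new_keywords):
    """Append validated keywords to an image's existing Keywords tag."""
    args = ["exiftool", "-overwrite_original"]
    args += [f"-Keywords+={kw}" for kw in new_keywords]
    args.append(image_path)
    subprocess.run(args, check=True)

# append_keywords("photo.jpg", ["canine", "mammal"])
```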
Invaluable assistance was provided by ocha221; after I reached out for solutions, they kindly wrote a working implementation of the idea:
This project is licensed under GPLv3. See the LICENSE file for details. semantic-tagging-tools is licensed under MIT. Llama.cpp is licensed under MIT.
Contributions are welcome! Please feel free to submit a Pull Request.