Changelog
=========
All notable changes to the Molecule Benchmarks package are documented here.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.1.9] - 2025-07-04
---------------------
Changed
~~~~~~~
- Updated documentation to match current version
- Fixed version inconsistencies across project files
[0.1.8] - 2025-06-29
---------------------
Changed
~~~~~~~
- Unique fraction at 1000 benchmark metric
- Scores now match Moses benchmark scores except for scaffold similarity
[0.1.6] - 2025-06-28
---------------------
Changed
~~~~~~~
- Improved performance with multiprocessing
[0.1.2] - 2025-06-27
---------------------
Added
~~~~~
- Direct SMILES evaluation via ``Benchmarker.benchmark(generated_smiles)`` method
- Simplified API for benchmarking pre-generated SMILES lists without implementing model interface
Changed
~~~~~~~
- Enhanced documentation with examples for both direct SMILES and model-based evaluation approaches
[0.1.0] - 2025-06-27
---------------------
Added
~~~~~
- Initial release of molecule-benchmarks package
- Comprehensive benchmark suite for molecular generation models
- Support for multiple datasets (QM9, Moses, GuacaMol)
- Validity, uniqueness, novelty, and diversity metrics
- Moses benchmark metrics implementation
- FCD (Fréchet ChemNet Distance) scoring
- KL divergence scoring for molecular property distributions
- Simple MoleculeGenerationModel protocol interface
- Built-in dummy model for testing
- Comprehensive documentation and examples
- Demo script showcasing package capabilities
- Support for both CPU and GPU computation
- Multiprocessing support for efficient computation
Core Features
~~~~~~~~~~~~~
**Benchmarker Class**
- Main benchmarking interface
- Support for both direct SMILES evaluation and model-based evaluation
- Configurable sample sizes and device selection
- Comprehensive metric calculation
**Dataset Support**
- ``SmilesDataset`` class for handling molecular datasets
- Built-in loaders for QM9, Moses, and GuacaMol datasets
- Support for custom datasets from files or lists
- Automatic SMILES canonicalization and validation
**Metrics Implementation**
- Validity metrics (valid, unique, novel fractions)
- Moses benchmark metrics (SNN score, internal diversity, filter passage)
- FCD scoring using pre-trained ChemNet
- KL divergence for property distribution comparison
- Scaffold and fragment similarity analysis
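
The validity-style fractions listed above can be illustrated with a minimal, library-independent sketch. It assumes the generated SMILES are already canonical and that invalid molecules are represented as ``None``. The function name ``validity_fractions``, its parameters, and the choice of denominators are illustrative assumptions, not the package's actual implementation.

.. code-block:: python

   from typing import Optional, Sequence


   def validity_fractions(
       generated: Sequence[Optional[str]],
       training: Sequence[str],
       k: int = 1000,
   ) -> dict[str, float]:
       """Compute illustrative valid, unique, unique-at-k, and novel fractions.

       Hypothetical helper: entries are assumed to be canonical SMILES,
       with None standing in for molecules that failed validation.
       """
       valid = [s for s in generated if s is not None]
       unique = set(valid)
       first_k = valid[:k]  # first k valid molecules only
       novel = unique - set(training)  # unique molecules unseen in training

       def ratio(numerator: int, denominator: int) -> float:
           # Guard against empty inputs instead of raising ZeroDivisionError.
           return numerator / denominator if denominator else 0.0

       return {
           "valid": ratio(len(valid), len(generated)),
           "unique": ratio(len(unique), len(valid)),
           "unique_at_k": ratio(len(set(first_k)), len(first_k)),
           "novel": ratio(len(novel), len(unique)),
       }


   scores = validity_fractions(
       ["CCO", "CCO", None, "c1ccccc1"], training=["CCO"], k=2
   )

Here ``unique_at_k`` is computed over the first ``k`` valid molecules, in the spirit of the unique fraction at 1000 metric mentioned under 0.1.8. The package may define the denominators differently.
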
**Model Interface**
- ``MoleculeGenerationModel`` protocol for easy integration
- ``DummyMoleculeGenerationModel`` for testing
- Flexible batch generation support
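
A protocol-based model interface of this kind might look like the sketch below. The names ``SmilesGenerator``, ``generate_batch``, ``ConstantGenerator``, and ``collect_samples`` are hypothetical and chosen only for illustration; consult the package's API reference for the real ``MoleculeGenerationModel`` signature.

.. code-block:: python

   from typing import Optional, Protocol


   class SmilesGenerator(Protocol):
       """Hypothetical stand-in for a generation-model protocol."""

       def generate_batch(self) -> list[Optional[str]]:
           """Return one batch of SMILES; None marks a failed generation."""
           ...


   class ConstantGenerator:
       """Toy model that always returns the same small batch."""

       def generate_batch(self) -> list[Optional[str]]:
           return ["CCO", None, "c1ccccc1"]


   def collect_samples(
       model: SmilesGenerator, num_samples: int
   ) -> list[Optional[str]]:
       """Call the model repeatedly until num_samples entries are collected."""
       samples: list[Optional[str]] = []
       while len(samples) < num_samples:
           batch = model.generate_batch()
           if not batch:
               # Avoid looping forever on a model that produces nothing.
               raise ValueError("model returned an empty batch")
           samples.extend(batch)
       return samples[:num_samples]


   samples = collect_samples(ConstantGenerator(), num_samples=5)

Because the protocol is structural, ``ConstantGenerator`` satisfies it without inheriting from it. Any object with a matching method can be passed to ``collect_samples``.
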
**Utilities**
- SMILES canonicalization with multiprocessing
- Molecular property calculation
- Statistical analysis functions
- Chemical filtering and validation
Technical Details
~~~~~~~~~~~~~~~~~
**Dependencies**
- RDKit for cheminformatics operations
- PyTorch for neural network computations (FCD)
- Pandas for data manipulation
- SciPy for statistical computations
- Requests for dataset downloads
**Performance Optimizations**
- Multiprocessing for SMILES canonicalization
- GPU support for FCD calculations
- Efficient batch processing
- Progress tracking with tqdm
**Quality Assurance**
- Comprehensive test suite
- Type hints throughout codebase
- Linting and formatting with Ruff
- Continuous integration
Upcoming Features
-----------------
Future releases may include:
- Additional benchmark datasets
- More chemical property metrics
- Support for 3D molecular representations
- Conditional generation metrics
- Web interface for benchmarking
- Integration with popular ML frameworks
Version History Summary
-----------------------
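.. list-table:: Version History
   :header-rows: 1
   :widths: 10 15 75

   * - Version
     - Date
     - Key Features
   * - 0.1.9
     - 2025-07-04
     - Documentation updated to match current version, version inconsistencies fixed
   * - 0.1.8
     - 2025-06-29
     - Unique fraction at 1000 metric, scores aligned with Moses except for scaffold similarity
   * - 0.1.6
     - 2025-06-28
     - Improved performance with multiprocessing
   * - 0.1.2
     - 2025-06-27
     - Direct SMILES evaluation, improved documentation
   * - 0.1.0
     - 2025-06-27
     - Initial release with full benchmark suite
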
Migration Guide
---------------
From 0.1.0 to 0.1.2
~~~~~~~~~~~~~~~~~~~~
The 0.1.2 release is fully backward compatible with 0.1.0. The main addition is the simplified direct SMILES evaluation:

**New in 0.1.2:**

.. code-block:: python

   # Direct SMILES evaluation (new)
   results = benchmarker.benchmark(generated_smiles)

**Still supported from 0.1.0:**

.. code-block:: python

   # Model-based evaluation (existing)
   results = benchmarker.benchmark_model(model)

No code changes are required when upgrading from 0.1.0 to 0.1.2.
Deprecation Policy
------------------
We follow semantic versioning and maintain backward compatibility within major versions:
- **Minor versions** (0.x.0): May add new features but won't break existing functionality
- **Patch versions** (0.0.x): Bug fixes and documentation improvements only
- **Major versions** (x.0.0): May include breaking changes, accompanied by a migration guide
Deprecated features will be marked as such for at least one minor version before removal.
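
One common way to mark a feature as deprecated in Python is to emit a ``DeprecationWarning`` from a decorator. The sketch below is an assumption about how this policy could be applied, not a description of the project's actual tooling. The ``deprecated`` decorator, the ``old_metric`` function, and the version numbers are all hypothetical.

.. code-block:: python

   import functools
   import warnings
   from typing import Any, Callable


   def deprecated(
       since: str, removal: str
   ) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
       """Mark a function as deprecated; both version strings are illustrative."""

       def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
           @functools.wraps(func)
           def wrapper(*args: Any, **kwargs: Any) -> Any:
               warnings.warn(
                   f"{func.__name__} is deprecated since {since} "
                   f"and may be removed in {removal}",
                   category=DeprecationWarning,
                   stacklevel=2,  # point the warning at the caller
               )
               return func(*args, **kwargs)

           return wrapper

       return decorator


   @deprecated(since="0.2.0", removal="0.3.0")
   def old_metric(values: list[float]) -> float:
       """Hypothetical legacy helper kept for one more minor version."""
       return sum(values) / len(values)

Calling ``old_metric`` still returns its result, so existing code keeps working during the deprecation window while users see a warning that points at their own call site.
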
Contributing to Changelog
--------------------------
When contributing to the project, please update this changelog:
1. Add entries under the "Unreleased" section
2. Group each entry under one of the headings ``Added``, ``Changed``, ``Deprecated``, ``Removed``, ``Fixed``, or ``Security``
3. Include a brief description of the change
4. Reference issue numbers when applicable
Example entry:

.. code-block:: text

   Added
   ~~~~~
   - New diversity metric based on molecular fingerprints (#123)
   - Support for custom molecular descriptors in KL divergence calculation

   Fixed
   ~~~~~
   - Handle edge case in FCD calculation when no valid molecules generated (#124)

For detailed contribution guidelines, see the :doc:`contributing` section.