Factory API

class bocoel.factories.AdaptorName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The names of the adaptors.

BIGBENCH_MC = 'BIGBENCH_MULTIPLE_CHOICE'

Corresponds to BigBenchMultipleChoice.

BIGBENCH_QA = 'BIGBENCH_QUESTION_ANSWER'

Corresponds to BigBenchQuestionAnswer.

SST2 = 'SST2'

Corresponds to Sst2QuestionAnswer.

GLUE = 'GLUE'

Corresponds to GlueAdaptor.

bocoel.factories.adaptor(name: str | AdaptorName, /, **kwargs: Any) → Adaptor

Create an adaptor.

Parameters:
  • name – The name of the adaptor.

  • **kwargs – The keyword arguments to pass to the adaptor. See the documentation of the corresponding adaptor for details.

Returns:

The adaptor instance.

Raises:

ValueError – If the name is unknown.
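The name argument accepts either a plain string or an enum member, and an unknown name raises ValueError. A minimal sketch of how such a lookup could work is shown here. The AdaptorName class below is a stand-in that copies the documented members, and the lookup helper is hypothetical; neither is taken from bocoel's source, and the library's actual matching rules (for example, case handling) may differ.

```python
from __future__ import annotations

from enum import Enum


class AdaptorName(str, Enum):
    """Stand-in mirroring the documented members; not imported from bocoel."""

    BIGBENCH_MC = "BIGBENCH_MULTIPLE_CHOICE"
    BIGBENCH_QA = "BIGBENCH_QUESTION_ANSWER"
    SST2 = "SST2"
    GLUE = "GLUE"


def lookup(name: str | AdaptorName) -> AdaptorName:
    """Hypothetical helper: resolve a string or member, else raise ValueError."""
    if isinstance(name, AdaptorName):
        return name
    try:
        # Calling the enum class looks a member up by its value.
        return AdaptorName(name)
    except ValueError:
        known = ", ".join(member.value for member in AdaptorName)
        raise ValueError(
            f"Unknown adaptor name {name!r}; expected one of: {known}"
        ) from None
```

Under this sketch the string matches the member's value rather than its attribute name, so "BIGBENCH_MULTIPLE_CHOICE" resolves to AdaptorName.BIGBENCH_MC.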

class bocoel.factories.CorpusName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The names of the corpora.

COMPOSED = 'COMPOSED'

Corresponds to ComposedCorpus.

bocoel.factories.corpus(name: str | CorpusName = CorpusName.COMPOSED, /, *, storage: Storage, embedder: Embedder, keys: Sequence[str], index_name: str | IndexName, **index_kwargs: Any) → Corpus

Create a corpus.

Parameters:
  • name – The name of the corpus.

  • storage – The storage to use.

  • embedder – The embedder to use.

  • keys – The keys to use for the index.

  • index_name – The name of the index backend to use.

  • **index_kwargs – The keyword arguments to pass to the index backend.

Returns:

The corpus instance.

Raises:

ValueError – If the name is unknown.
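The signature combines a positional-only name (everything before the /), keyword-only arguments (everything after the *), and a catch-all **index_kwargs. The sketch below reproduces only that calling convention with a hypothetical stand-in function, corpus_like, which returns a dict instead of building a corpus. The option name some_option is made up for illustration and is not a real index parameter.

```python
from __future__ import annotations

from typing import Any, Sequence


def corpus_like(
    name: str = "COMPOSED",
    /,
    *,
    storage: Any,
    embedder: Any,
    keys: Sequence[str],
    index_name: str,
    **index_kwargs: Any,
) -> dict[str, Any]:
    """Hypothetical stand-in with the same parameter layout as the documented factory."""
    return {
        "name": name,
        "keys": list(keys),
        "index_name": index_name,
        "index_kwargs": index_kwargs,
    }


# The name can be omitted; everything else must be passed by keyword.
default_call = corpus_like(
    storage=None, embedder=None, keys=["question"], index_name="HNSWLIB", some_option=4
)

# Because name is positional-only, name="..." is NOT bound to it.
# Python places it in **index_kwargs instead.
keyword_call = corpus_like(
    name="OTHER", storage=None, embedder=None, keys=["question"], index_name="HNSWLIB"
)
```

One consequence of this layout in plain Python: a corpus name has to be passed positionally. Writing name=... as a keyword ends up among the index keyword arguments rather than selecting the corpus.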

class bocoel.factories.EmbedderName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The names of the embedders.

SBERT = 'SBERT'

Corresponds to SbertEmbedder.

HUGGINGFACE = 'HUGGINGFACE'

Corresponds to HuggingfaceEmbedder.

HUGGINGFACE_ENSEMBLE = 'HUGGINGFACE_ENSEMBLE'

Corresponds to EnsembleEmbedder concatenating HuggingfaceEmbedder.

bocoel.factories.embedder(name: str | EmbedderName, /, *, model_name: str | list[str], device: str = 'auto', batch_size: int) → Embedder

Create an embedder.

Parameters:
  • name – The name of the embedder.

  • model_name – The model name to use: a string for SBERT and HUGGINGFACE, or a list of strings for HUGGINGFACE_ENSEMBLE.

  • device – The device to use.

  • batch_size – The batch size to use.

Returns:

The embedder instance.

Raises:
  • ValueError – If the name is unknown.

  • TypeError – If the model name is not a string for SBERT or HUGGINGFACE, or not a list of strings for HUGGINGFACE_ENSEMBLE.
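The two error conditions can be illustrated with a small check. This is a sketch of the documented rule only: check_model_name is a hypothetical helper, not a bocoel function, and it compares plain strings rather than EmbedderName members.

```python
from __future__ import annotations

SINGLE_MODEL = {"SBERT", "HUGGINGFACE"}
ENSEMBLE = "HUGGINGFACE_ENSEMBLE"


def check_model_name(name: str, model_name: str | list[str]) -> None:
    """Hypothetical check mirroring the documented ValueError/TypeError rules."""
    if name in SINGLE_MODEL:
        # Single-model embedders take exactly one model name.
        if not isinstance(model_name, str):
            raise TypeError(f"{name} expects a single model name (str)")
    elif name == ENSEMBLE:
        # The ensemble takes one model name per member embedder.
        is_str_list = isinstance(model_name, list) and all(
            isinstance(item, str) for item in model_name
        )
        if not is_str_list:
            raise TypeError(f"{name} expects a list of model names (list[str])")
    else:
        raise ValueError(f"Unknown embedder name: {name!r}")
```

With this check, passing a list to SBERT or a single string to HUGGINGFACE_ENSEMBLE raises TypeError, while an unrecognized name raises ValueError.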

class bocoel.factories.IndexName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The names of the indices.

FAISS = 'FAISS'

Corresponds to FaissIndex.

HNSWLIB = 'HNSWLIB'

Corresponds to HnswlibIndex.

POLAR = 'POLAR'

Corresponds to PolarIndex.

WHITENING = 'WHITENING'

Corresponds to WhiteningIndex.

bocoel.factories.index_class(name: str | IndexName, /) → type[Index]

Get the index class for the given name.

Parameters:

name – The name of the index.
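Unlike the other factories, index_class returns a class rather than an instance, so the caller decides when and with which arguments to construct it. The sketch below shows that pattern with stand-in classes that only borrow the documented names; index_class_like, the registry, and the option name some_option are hypothetical. Raising ValueError for an unknown name is an assumption carried over from the other factories, since the documentation for index_class does not state it.

```python
from __future__ import annotations

from typing import Any


class Index:
    """Stand-in base class; not bocoel's Index."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


class FaissIndex(Index):
    """Stand-in for the FAISS-backed index."""


class HnswlibIndex(Index):
    """Stand-in for the hnswlib-backed index."""


class PolarIndex(Index):
    """Stand-in for the polar index."""


class WhiteningIndex(Index):
    """Stand-in for the whitening index."""


_REGISTRY: dict[str, type[Index]] = {
    "FAISS": FaissIndex,
    "HNSWLIB": HnswlibIndex,
    "POLAR": PolarIndex,
    "WHITENING": WhiteningIndex,
}


def index_class_like(name: str) -> type[Index]:
    """Hypothetical lookup: return the class itself, not an instance."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown index name: {name!r}") from None


# The class is resolved first and constructed later with backend-specific options.
index_cls = index_class_like("HNSWLIB")
index = index_cls(some_option=1)
```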

class bocoel.factories.ClassifierName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The classifier names.

HUGGINGFACE_LOGITS = 'HUGGINGFACE_LOGITS'

Corresponds to HuggingfaceLogitsLM.

HUGGINGFACE_SEQUENCE = 'HUGGINGFACE_SEQUENCE'

Corresponds to HuggingfaceSequenceLM.

class bocoel.factories.GeneratorName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The generator names.

HUGGINGFACE_GENERATIVE = 'HUGGINGFACE_GENERATIVE'

Corresponds to HuggingfaceGenerativeLM.

bocoel.factories.generative(name: str | GeneratorName, /, *, model_path: str, batch_size: int, device: str = 'auto', add_sep_token: bool = False) → GenerativeModel

Create a generative model.

Parameters:
  • name – The name of the model.

  • model_path – The path to the model.

  • batch_size – The batch size to use.

  • device – The device to use.

  • add_sep_token – Whether to add the sep token.

Returns:

The generative model instance.

Raises:

ValueError – If the name is unknown.

class bocoel.factories.OptimizerName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The names of the optimizers.

BAYESIAN = 'BAYESIAN'

Corresponds to AxServiceOptimizer.

KMEANS = 'KMEANS'

Corresponds to KMeansOptimizer.

KMEDOIDS = 'KMEDOIDS'

Corresponds to KMedoidsOptimizer.

RANDOM = 'RANDOM'

Corresponds to RandomOptimizer.

BRUTE = 'BRUTE'

Corresponds to BruteForceOptimizer.

UNIFORM = 'UNIFORM'

Corresponds to UniformOptimizer.

bocoel.factories.optimizer(name: str | OptimizerName, /, *, corpus: Corpus, adaptor: Adaptor, **kwargs: Any) → Optimizer

Create an optimizer instance.

Parameters:
  • name – The name of the optimizer.

  • corpus – The corpus to optimize.

  • adaptor – The adaptor to use.

  • **kwargs – Additional keyword arguments to pass to the optimizer. See the documentation for the specific optimizer for details.

Returns:

The optimizer instance.

Raises:

ValueError – If the name is unknown.

class bocoel.factories.StorageName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

The storage names.

PANDAS = 'PANDAS'

Corresponds to PandasStorage.

DATASETS = 'DATASETS'

Corresponds to DatasetsStorage.

bocoel.factories.storage(storage: str | StorageName, /, *, path: str = '', name: str = '', split: str = '') → Storage

Create a single storage.

Parameters:
  • storage – The name of the storage.

  • path – The path to the storage.

  • name – The name of the storage.

  • split – The split to use.

Returns:

The storage instance.

Raises:

ValueError – If the storage is unknown.