Custom Models

You can upload custom models to a cluster with the Bring Your Own Model (BYOM) process.

To use custom models, you need a cluster configured to use OpenSearch version 2.11. By default, new clusters use version 2.11. To create a cluster, see Creating an OpenSearch Cluster.

For existing clusters configured for version 2.3, you can perform an inline upgrade to version 2.11. For more information, see Inline Upgrade for OpenSearch Clusters.

To upgrade existing clusters configured for version 1.2.3 to 2.11, use the upgrade process described in Upgrading an OpenSearch Cluster.

The BYOM process to import custom models includes the following steps:

  1. Complete the following prerequisites:
    • Configure the required IAM policy.
    • Configure the recommended cluster settings.
    • Upload a custom model to an Object Storage bucket.
  2. Register the model.
  3. Deploy the model.
  4. (Optional) Test the model.

1: Prerequisites

IAM Policy

You need to create a policy to grant OCI Search with OpenSearch access to the Object Storage bucket you upload the custom model to. The following policy example includes the required permissions:

ALLOW ANY-USER to manage object-family in tenancy WHERE ALL {request.principal.type='opensearchcluster', request.resource.compartment.id='<cluster_compartment_id>'}

If you're new to policies, see Getting Started with Policies and Common Policies.

Configure Cluster Settings

Use the settings operation of the Cluster APIs to configure the recommended cluster settings for semantic search. The following example includes the recommended settings:

PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable":"true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}
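
If you apply these settings from a script rather than from the OpenSearch Dashboards Dev Tools console, the same payload can be sent as a plain HTTP PUT. The following is a sketch using only the Python standard library; the endpoint is a placeholder for your cluster's API endpoint, and you need to add authentication appropriate to your cluster:

```python
import json
import urllib.request

# Recommended ml_commons settings from the example above.
ML_COMMONS_SETTINGS = {
    "persistent": {
        "plugins": {
            "ml_commons": {
                "only_run_on_ml_node": "false",
                "model_access_control_enabled": "true",
                "native_memory_threshold": "99",
                "rag_pipeline_feature_enabled": "true",
                "memory_feature_enabled": "true",
                "allow_registering_model_via_local_file": "true",
                "allow_registering_model_via_url": "true",
                "model_auto_redeploy.enable": "true",
                "model_auto_redeploy.lifetime_retry_times": 10,
            }
        }
    }
}

def apply_settings(endpoint: str) -> None:
    """PUT the recommended settings to <endpoint>/_cluster/settings.

    endpoint is a placeholder, for example "https://<cluster_host>:9200";
    add authentication headers as required by your cluster.
    """
    req = urllib.request.Request(
        f"{endpoint}/_cluster/settings",
        data=json.dumps(ML_COMMONS_SETTINGS).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)
```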

Upload Model to Object Storage Bucket

To make a custom model available to register for a cluster, you need to upload the model to an Object Storage bucket in the tenancy. If you don't have an existing Object Storage bucket, you need to create the bucket. For a tutorial that walks you through how to create a bucket, see Creating a Bucket.

Next, upload the custom model to the bucket. See Uploading Files to a Bucket for a tutorial that walks you through how to upload files to a bucket. For the purposes of this walkthrough, you can download any supported Hugging Face model to upload.

2: Register the Model

After you have uploaded a custom model to an Object Storage bucket, you need to get the URL for accessing the uploaded file and pass the URL in the register operation from the Model APIs. You can then use the Get operation of the Tasks APIs to track the completion of the register operation and get the model ID to use when you deploy the model.

To get the URL for the uploaded model file

  1. Open the navigation menu and click Storage. Under Object Storage & Archive Storage, click Buckets.

  2. Click the bucket that contains the uploaded model. The bucket's Details page appears.

  3. Click the Actions menu next to the object name, and then select View Object Details. The Object Details dialog box appears.

  4. The URL to access the model file is displayed in the URL Path (URI) field. Copy the URL to use in the next step when you register the custom model. You might see a warning message indicating that the current URL in the URL Path (URI) field is deprecated, with a new URL specified in the warning message. If you see this warning message, use the new URL in the warning message instead to register the custom model.
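
If you script this flow instead of copying the URL from the Console, the URL path follows the pattern shown in the register example later in this topic. A sketch, assuming the namespace-prefixed URL format this doc uses (the helper name is illustrative; percent-encoding the object name handles spaces and special characters):

```python
from urllib.parse import quote

def object_storage_url(namespace: str, region: str,
                       bucket: str, object_name: str) -> str:
    """Build the Object Storage URL path for an uploaded model archive."""
    return (
        f"https://{namespace}.objectstorage.{region}.oraclecloud.com"
        f"/n/{namespace}/b/{bucket}/o/{quote(object_name, safe='')}"
    )
```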

Register the custom model

Use the register operation to register the custom model. In the following example, the custom model uploaded to the Object Storage bucket is the huggingface/sentence-transformers/all-MiniLM-L12-v2 model. The values specified in model_config for this example are from the model's config file. Ensure that you're using the applicable model configuration values for the custom model you're registering.

Specify the Object Storage URL in the actions section. This is an OCI Search with OpenSearch API addition that supports the BYOM scenario.

POST /_plugins/_ml/models/_register
{
  "model_group_id": "Te1qPY0BxVYhYdT6TVCt",
    "name": "sentence-transformers/all-MiniLM-L12-v2",
    "version": "1.0.1",
    "description": "This is a sentence-transformers model: It maps sentences & paragraphs to a 384 dimensional dense vector space and can be used for tasks like clustering or semantic search.",
    "model_task_type": "TEXT_EMBEDDING",
    "model_format": "TORCH_SCRIPT",
    "model_content_size_in_bytes": 134568911,
    "model_content_hash_value": "f8012a4e6b5da1f556221a12160d080157039f077ab85a5f6b467a47247aad49",
    "model_config": {
        "model_type": "bert",
        "embedding_dimension": 384,
        "framework_type": "sentence_transformers",
        "all_config": "{\"_name_or_path\":\"microsoft/MiniLM-L12-H384-uncased\",\"attention_probs_dropout_prob\":0.1,\"gradient_checkpointing\":false,\"hidden_act\":\"gelu\",\"hidden_dropout_prob\":0.1,\"hidden_size\":384,\"initializer_range\":0.02,\"intermediate_size\":1536,\"layer_norm_eps\":1e-12,\"max_position_embeddings\":512,\"model_type\":\"bert\",\"num_attention_heads\":12,\"num_hidden_layers\":12,\"pad_token_id\":0,\"position_embedding_type\":\"absolute\",\"transformers_version\":\"4.8.2\",\"type_vocab_size\":2,\"use_cache\":true,\"vocab_size\":30522}"
    },
    "url_connector": {
        "protocol": "oci_sigv1",
        "parameters": {
            "auth_type": "resource_principal"
        },
        "actions": [
            {
                "method": "GET",
                "action_type": "DOWNLOAD",
                "url": "<Object_Storage_URL_Path>"
            }
        ]
    }
}

Replace <Object_Storage_URL_Path> with a valid Object Storage URL, for example:
https://<tenancy_name>.objectstorage.us-ashburn-1.oraclecloud.com/n/<tenancy_name>/b/<bucket_name>/o/sentence-transformers_all-distilroberta-v1-1.0.1-torch_script.zip

Make note of the task_id returned in the response; you can use the task_id to check the status of the operation.

For example, from the following response:

{
  "task_id": "<task_ID>",
  "status": "CREATED"
}

Track the register task and get the model ID

To check the status of the register operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/<task_ID>

When the register operation is complete, the status value in the response to the Get operation is COMPLETED, as shown in the following example:

{
  "model_id": "<model_ID>",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "3qSqVfK2RvGJv1URKfS1bw"
  ],
  "create_time": 1706829732915,
  "last_update_time": 1706829780094,
  "is_async": true
}

Make note of the model_id value returned in the response to use when you deploy the model.
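
Because register (and deploy, in the next step) run asynchronously, scripted workflows typically poll the Tasks API until the state becomes COMPLETED. A minimal polling sketch, kept independent of any particular HTTP client by injecting the fetch callable (the helper and its parameters are illustrative):

```python
import time

def wait_for_task(get_task, timeout_s: float = 300, interval_s: float = 5):
    """Poll an ML task until it reaches a terminal state.

    get_task is any callable returning the parsed JSON body of
    GET /_plugins/_ml/tasks/<task_ID>. Returns the final task body
    on COMPLETED; raises on FAILED or timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        task = get_task()
        state = task.get("state") or task.get("status")
        if state == "COMPLETED":
            return task
        if state == "FAILED":
            raise RuntimeError(f"task failed: {task}")
        time.sleep(interval_s)
    raise TimeoutError("task did not complete within the timeout")
```

For a register task, read model_id from the returned body; for a deploy task, a COMPLETED state means the model is ready to serve requests.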

3: Deploy the Model

After the register operation completes, you can deploy the model to the cluster using the deploy operation of the Model APIs, passing the model_id from the Get operation response in the previous step, as shown in the following example:

POST /_plugins/_ml/models/<model_ID>/_deploy

Make note of the task_id returned in the response; you can use the task_id to check the status of the operation.

For example, from the following response:

{
  "task_id": "<task_ID>",
  "task_type": "DEPLOY_MODEL",
  "status": "CREATED"
}

To check the status of the deploy operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:

GET /_plugins/_ml/tasks/<task_ID>

When the deploy operation is complete, the status value in the response to the Get operation is COMPLETED.

4: Test the Model

After the model is successfully deployed, you can test the model by using the text_embedding endpoint, as shown in the following example:

POST /_plugins/_ml/_predict/text_embedding/ZANGOI0B9HkHUYmG79QF
{
  "text_docs":["hello world", "new message", "this too"]
}
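
The text_embedding response contains one dense vector per input document (384 dimensions for this model). To sanity-check a deployment, you can compare two returned vectors with cosine similarity, the measure semantic search ranks on; semantically close texts should score higher than unrelated ones. A minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(x * x for x in b)))
    return dot / norm
```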

Alternatively, you can use the _predict endpoint, as shown in the following example:

POST /_plugins/_ml/models/ZANGOI0B9HkHUYmG79QF/_predict
{
 "parameters":{
    "passage_text": "Testing the cohere embedding model"
}
}