AI Model Inference (preview:2024-05-01)

2025/02/12 • 4 new methods

GetChatCompletions (new)
Description Gets chat completions for the provided chat messages. Completions support a wide variety of tasks and generate text that continues from or "completes" provided prompt data. The method makes a REST API call to the `/chat/completions` route on the given endpoint.
Reference Link ¶

⚼ Request

POST:  /chat/completions
{
api-version: string ,
extra-parameters: string ,
body:
{
messages:
[
{
role: enum ,
}
,
]
,
frequency_penalty: number ,
stream: boolean ,
presence_penalty: number ,
temperature: number ,
top_p: number ,
max_tokens: integer ,
response_format:
{
type: string ,
}
,
stop:
[
string ,
]
,
tools:
[
{
type: enum ,
function:
{
name: string ,
description: string ,
parameters: object ,
}
,
}
,
]
,
tool_choice: string ,
seed: integer ,
model: string ,
modalities:
[
string ,
]
,
}
,
}

⚐ Response (200)

{
id: string ,
object: enum ,
created: integer ,
model: string ,
choices:
[
{
index: integer ,
finish_reason: enum ,
message:
{
role: enum ,
content: string ,
tool_calls:
[
{
id: string ,
type: enum ,
function:
{
name: string ,
arguments: string ,
}
,
}
,
]
,
audio:
{
id: string ,
expires_at: integer ,
data: string ,
format: enum ,
transcript: string ,
}
,
}
,
}
,
]
,
usage:
{
completion_tokens: integer ,
prompt_tokens: integer ,
total_tokens: integer ,
completion_tokens_details:
{
audio_tokens: integer ,
total_tokens: integer ,
}
,
prompt_tokens_details:
{
audio_tokens: integer ,
cached_tokens: integer ,
}
,
}
,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}
GetEmbeddings (new)
Description Return the embedding vectors for given text prompts. The method makes a REST API call to the `/embeddings` route on the given endpoint.
Reference Link ¶

⚼ Request

POST:  /embeddings
{
api-version: string ,
extra-parameters: string ,
body:
{
input:
[
string ,
]
,
dimensions: integer ,
encoding_format: enum ,
input_type: enum ,
model: string ,
}
,
}

⚐ Response (200)

{
id: string ,
data:
[
{
embedding:
[
number ,
]
,
index: integer ,
object: enum ,
}
,
]
,
usage:
{
prompt_tokens: integer ,
total_tokens: integer ,
}
,
object: enum ,
model: string ,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}
GetImageEmbeddings (new)
Description Return the embedding vectors for given images. The method makes a REST API call to the `/images/embeddings` route on the given endpoint.
Reference Link ¶

⚼ Request

POST:  /images/embeddings
{
api-version: string ,
extra-parameters: string ,
body:
{
input:
[
{
image: string ,
text: string ,
}
,
]
,
dimensions: integer ,
encoding_format: enum ,
input_type: enum ,
model: string ,
}
,
}

⚐ Response (200)

{
id: string ,
data:
[
{
embedding:
[
number ,
]
,
index: integer ,
object: enum ,
}
,
]
,
usage:
{
prompt_tokens: integer ,
total_tokens: integer ,
}
,
object: enum ,
model: string ,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}
GetModelInfo (new)
Description Returns information about the AI model. The method makes a REST API call to the `/info` route on the given endpoint. This method will only work when using Serverless API or Managed Compute endpoint. It will not work for GitHub Models endpoint or Azure OpenAI endpoint.
Reference Link ¶

⚼ Request

GET:  /info
{
api-version: string ,
}

⚐ Response (200)

{
model_name: string ,
model_type: enum ,
model_provider_name: string ,
}

⚐ Response (default)

{
$headers:
{
x-ms-error-code: string ,
}
,
$schema:
{
error:
{
code: string ,
message: string ,
target: string ,
details:
[
string ,
]
,
innererror:
{
code: string ,
innererror: string ,
}
,
}
,
}
,
}