Datastores API

Datastores are collections of documents that agents can search and reference. This page documents all datastore and document-related endpoints.

The Datastore Object

```json
{
  "id": "ds_abc123def456",
  "name": "Product Documentation",
  "description": "All product manuals and guides",
  "document_count": 42,
  "total_size_bytes": 15728640,
  "status": "ready",
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:30:00Z"
}
```

Attributes

| Field | Type | Description |
|-------|------|-------------|
| id | string | Unique identifier for the datastore |
| name | string | Display name of the datastore |
| description | string | Optional description |
| document_count | integer | Number of documents in the datastore |
| total_size_bytes | integer | Total size of all documents, in bytes |
| status | string | ready, processing, or error |
| created_at | string | ISO 8601 timestamp of creation |
| updated_at | string | ISO 8601 timestamp of last update |


The Document Object

```json
{
  "id": "doc_xyz789",
  "datastore_id": "ds_abc123",
  "name": "User Manual v2.pdf",
  "type": "application/pdf",
  "size_bytes": 524288,
  "page_count": 24,
  "status": "completed",
  "error": null,
  "created_at": "2024-01-15T11:00:00Z",
  "processed_at": "2024-01-15T11:02:30Z"
}
```

Attributes

| Field | Type | Description |
|-------|------|-------------|
| id | string | Unique identifier for the document |
| datastore_id | string | ID of the parent datastore |
| name | string | Original filename or display name |
| type | string | MIME type of the document |
| size_bytes | integer | File size in bytes |
| page_count | integer | Number of pages (for PDFs) |
| status | string | Processing status (see Document Status below) |
| error | string or null | Error message if processing failed; null otherwise |
| created_at | string | Upload timestamp |
| processed_at | string | Processing completion timestamp |

Document Status

| Status | Description |
|--------|-------------|
| pending | Document uploaded, waiting for processing |
| processing | Currently being processed |
| completed | Successfully processed and indexed |
| failed | Processing failed (check the error field) |


List Datastores

GET /v1/datastores

Returns a list of all datastores in your workspace.

Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| limit | integer | Number of datastores to return (1-100) |
| cursor | string | Pagination cursor |

Example

```bash
curl https://api.orka.ai/v1/datastores \
  -H "Authorization: Bearer sk_your_api_key"
```
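The `limit` and `cursor` parameters point to cursor-based pagination, but this page does not show the list response. The sketch below therefore assumes each page looks like `{ "data": [...], "next_cursor": "..." }`; both field names are hypothetical, as are the helper names and the `fetchImpl` option.

```javascript
const BASE_URL = 'https://api.orka.ai/v1';

// Build a list URL from the documented query parameters.
function buildListUrl(path, { limit, cursor } = {}) {
  const url = new URL(`${BASE_URL}${path}`);
  if (limit !== undefined) {
    if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
      throw new RangeError('limit must be an integer between 1 and 100');
    }
    url.searchParams.set('limit', String(limit));
  }
  if (cursor) url.searchParams.set('cursor', cursor);
  return url.toString();
}

// Walk every page of datastores.
// ASSUMPTION: the response body has `data` and `next_cursor` fields;
// neither name is confirmed by this page.
async function* listAllDatastores(apiKey, { limit = 100, fetchImpl = fetch } = {}) {
  let cursor;
  do {
    const res = await fetchImpl(buildListUrl('/datastores', { limit, cursor }), {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`List failed with HTTP ${res.status}`);
    const page = await res.json();
    yield* page.data;
    cursor = page.next_cursor;
  } while (cursor);
}
```

A caller would consume it with `for await (const ds of listAllDatastores(apiKey)) { ... }`. If the real response uses different field names, only the two lines that read `page.data` and `page.next_cursor` need to change.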

Create Datastore

POST /v1/datastores

Creates a new datastore.

Request Body

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Display name (1-100 characters) |
| description | string | No | Optional description |

Example

```bash
curl -X POST https://api.orka.ai/v1/datastores \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Legal Documents",
    "description": "Contracts and legal agreements"
  }'
```
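The same request can be prepared in JavaScript with the 1-100 character limit on `name` checked before anything is sent. This is a minimal sketch: `buildCreateDatastoreRequest` is a hypothetical helper, and the page does not say how the API counts characters, so the check uses JavaScript string length as an approximation.

```javascript
// Hypothetical helper: builds the fetch() arguments for POST /v1/datastores.
function buildCreateDatastoreRequest(apiKey, { name, description }) {
  // ASSUMPTION: the 1-100 limit is approximated by JavaScript string length.
  if (typeof name !== 'string' || name.length < 1 || name.length > 100) {
    throw new RangeError('name must be a string of 1-100 characters');
  }
  const body = { name };
  // description is optional, so send it only when the caller provided one.
  if (description !== undefined) body.description = description;
  return {
    url: 'https://api.orka.ai/v1/datastores',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage:
// const { url, init } = buildCreateDatastoreRequest(apiKey, { name: 'Legal Documents' });
// const datastore = await (await fetch(url, init)).json();
```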

Get Datastore

GET /v1/datastores/:id

Retrieves a specific datastore by ID.

Example

```bash
curl https://api.orka.ai/v1/datastores/ds_abc123 \
  -H "Authorization: Bearer sk_your_api_key"
```

Delete Datastore

DELETE /v1/datastores/:id

Permanently deletes a datastore and every document in it.

Warning: This action cannot be undone.

Example

```bash
curl -X DELETE https://api.orka.ai/v1/datastores/ds_abc123 \
  -H "Authorization: Bearer sk_your_api_key"
```

List Documents

GET /v1/datastores/:id/documents

Returns a list of documents in a datastore.

Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| limit | integer | Number of documents to return (1-100) |
| cursor | string | Pagination cursor |
| status | string | Filter by status (pending, processing, completed, failed) |

Example

```bash
curl "https://api.orka.ai/v1/datastores/ds_abc123/documents?status=completed" \
  -H "Authorization: Bearer sk_your_api_key"
```

Upload Document

POST /v1/documents

Uploads a new document to a datastore.

Request Body (multipart/form-data)

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| datastore_id | string | Yes | Target datastore ID |
| file | file | Yes | The file to upload |
| name | string | No | Custom display name |

Supported File Types

| Type | Extensions | Max Size |
|------|------------|----------|
| PDF | .pdf | 50 MB |
| Word | .docx, .doc | 25 MB |
| Text | .txt, .md | 10 MB |
| HTML | .html, .htm | 10 MB |
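A client can catch most `unsupported_format` and `too_large` failures before uploading by checking the file against the table above. The sketch below is one way to do that; `checkUpload` is a hypothetical helper, and it assumes "MB" means 1,048,576 bytes, which this page does not specify.

```javascript
const MB = 1024 * 1024; // ASSUMPTION: "MB" in the table means 1,048,576 bytes.

// Size limits per extension, copied from the Supported File Types table.
const MAX_SIZE_BY_EXTENSION = {
  '.pdf': 50 * MB,
  '.docx': 25 * MB,
  '.doc': 25 * MB,
  '.txt': 10 * MB,
  '.md': 10 * MB,
  '.html': 10 * MB,
  '.htm': 10 * MB,
};

// Returns null when the file looks uploadable, or a reason string otherwise.
function checkUpload(filename, sizeBytes) {
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot).toLowerCase();
  const limit = MAX_SIZE_BY_EXTENSION[ext];
  if (limit === undefined) return `unsupported extension "${ext}"`;
  if (sizeBytes > limit) return `file exceeds the ${limit / MB} MB limit for ${ext}`;
  return null;
}
```

In a browser, `checkUpload(file.name, file.size)` can run on the selected `File` before the form is submitted. The server remains the authority, since it may inspect file contents rather than the extension.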

Example with cURL

```bash
curl -X POST https://api.orka.ai/v1/documents \
  -H "Authorization: Bearer sk_your_api_key" \
  -F "datastore_id=ds_abc123" \
  -F "file=@/path/to/document.pdf" \
  -F "name=Product Manual"
```

Example with JavaScript

```javascript
// fileInput is an <input type="file"> element; apiKey holds your API key.
const formData = new FormData();
formData.append('datastore_id', 'ds_abc123');
formData.append('file', fileInput.files[0]);
formData.append('name', 'Product Manual');

// Do not set Content-Type yourself; fetch adds the multipart boundary.
const response = await fetch('https://api.orka.ai/v1/documents', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
  },
  body: formData,
});
const uploaded = await response.json();
```

Response

```json
{
  "id": "doc_new456",
  "datastore_id": "ds_abc123",
  "name": "Product Manual",
  "type": "application/pdf",
  "size_bytes": 1048576,
  "status": "pending",
  "created_at": "2024-01-15T12:00:00Z"
}
```

Documents are processed asynchronously. Poll GET /v1/documents/:id until the status is completed or failed, or use webhooks to be notified when processing finishes.
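Polling can be sketched as a loop that re-fetches the document until it reaches a terminal status. The helper names, the 2-second interval, the 5-minute timeout, and the `fetchImpl` option are illustrative assumptions, not values recommended by this page.

```javascript
// Statuses after which a document no longer changes (see Document Status).
const TERMINAL_STATUSES = new Set(['completed', 'failed']);

function isTerminal(status) {
  return TERMINAL_STATUSES.has(status);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll GET /v1/documents/:id until processing finishes or the deadline passes.
// ASSUMPTION: intervalMs and timeoutMs defaults are arbitrary illustrative values.
async function waitForDocument(
  apiKey,
  documentId,
  { intervalMs = 2000, timeoutMs = 300000, fetchImpl = fetch } = {}
) {
  const deadline = Date.now() + timeoutMs;
  const url = `https://api.orka.ai/v1/documents/${encodeURIComponent(documentId)}`;
  while (true) {
    const res = await fetchImpl(url, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`Status check failed with HTTP ${res.status}`);
    const doc = await res.json();
    if (isTerminal(doc.status)) return doc;
    // Give up rather than sleep past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Document ${documentId} still ${doc.status} after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

The returned document may have status `failed`, so callers should still inspect `status` and `error` before using it.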


Get Document

GET /v1/documents/:id

Retrieves a specific document by ID.

Example

```bash
curl https://api.orka.ai/v1/documents/doc_xyz789 \
  -H "Authorization: Bearer sk_your_api_key"
```

Use this endpoint to check the processing status of an uploaded document.


Delete Document

DELETE /v1/documents/:id

Permanently deletes a document.

Example

```bash
curl -X DELETE https://api.orka.ai/v1/documents/doc_xyz789 \
  -H "Authorization: Bearer sk_your_api_key"
```

Response

```json
{
  "id": "doc_xyz789",
  "deleted": true
}
```

Processing Errors

When document processing fails, the status will be failed and the error field will contain details:

| Error | Description | Solution |
|-------|-------------|----------|
| unsupported_format | File type not supported | Use a supported file format |
| file_corrupted | File is damaged or unreadable | Re-upload a valid file |
| password_protected | PDF is password protected | Remove password and re-upload |
| too_large | File exceeds size limit | Split into smaller files |
| processing_timeout | Processing took too long | Contact support |
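The table above can be turned into a lookup that tells a user what to do next. This sketch assumes the document's `error` field holds one of these codes verbatim; the field is described elsewhere on this page as an error message, so the exact format is not confirmed. `describeFailure` is a hypothetical helper.

```javascript
// Error codes and suggested fixes, copied from the Processing Errors table.
const ERROR_SOLUTIONS = new Map([
  ['unsupported_format', 'Use a supported file format'],
  ['file_corrupted', 'Re-upload a valid file'],
  ['password_protected', 'Remove password and re-upload'],
  ['too_large', 'Split into smaller files'],
  ['processing_timeout', 'Contact support'],
]);

// Returns a suggested fix for a failed document, or null if it did not fail.
// ASSUMPTION: doc.error is exactly one of the codes above. If the API returns
// a longer message, this falls through to the "unrecognized" branch.
function describeFailure(doc) {
  if (doc.status !== 'failed') return null;
  const fix = ERROR_SOLUTIONS.get(doc.error);
  return fix ?? `Unrecognized error: ${doc.error}`;
}
```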


Update Datastore

PATCH /v1/datastores/:id

Updates an existing datastore.

Path Parameters

| Parameter | Description |
|-----------|-------------|
| id | The datastore ID to update |

Request Body

| Field | Type | Description |
|-------|------|-------------|
| name | string | New display name |
| description | string | New description |

Example

```bash
curl -X PATCH https://api.orka.ai/v1/datastores/ds_abc123 \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Updated Name",
    "description": "Updated description"
  }'
```
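The request body table lists no required fields, which suggests that a PATCH may carry only the fields being changed. The sketch below builds such a partial body. It assumes omitted fields are left unchanged, which is the usual PATCH convention but is not stated on this page; `buildDatastorePatch` is a hypothetical helper.

```javascript
// Hypothetical helper: builds the JSON body for PATCH /v1/datastores/:id.
// ASSUMPTION: fields left out of the body keep their current values.
function buildDatastorePatch(changes) {
  const body = {};
  // Copy only the two documented fields and ignore anything else.
  for (const field of ['name', 'description']) {
    if (changes[field] !== undefined) body[field] = changes[field];
  }
  if (Object.keys(body).length === 0) {
    throw new Error('Provide name, description, or both');
  }
  return JSON.stringify(body);
}

// Usage:
// await fetch('https://api.orka.ai/v1/datastores/ds_abc123', {
//   method: 'PATCH',
//   headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
//   body: buildDatastorePatch({ description: 'Updated description' }),
// });
```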

Web Sources (Coming Soon)

Connect external web sources to your datastore.

Add Web Source

POST /v1/datastores/:id/web-sources

Adds a web source to crawl and index.

Path Parameters

| Parameter | Description |
|-----------|-------------|
| id | The datastore ID |

Request Body

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| url | string | Yes | Base URL to crawl |
| crawl_depth | integer | No | How many levels deep to crawl (default: 1) |
| include_patterns | array | No | URL patterns to include |
| exclude_patterns | array | No | URL patterns to exclude |
| refresh_interval | string | No | How often to re-crawl (daily, weekly, monthly) |

Example

```bash
curl -X POST https://api.orka.ai/v1/datastores/ds_abc123/web-sources \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "crawl_depth": 2,
    "include_patterns": ["/docs/*"],
    "refresh_interval": "weekly"
  }'
```

Response

```json
{
  "id": "ws_xyz789",
  "datastore_id": "ds_abc123",
  "url": "https://docs.example.com",
  "status": "crawling",
  "pages_found": 0,
  "pages_indexed": 0,
  "created_at": "2024-01-15T12:00:00Z"
}
```
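Because this endpoint is not yet available, the following is only a sketch of how a client might validate the Add Web Source request body against the table above before sending it. `buildWebSourceBody` is a hypothetical helper, and the accepted range for `crawl_depth` is not documented, so only its type is checked.

```javascript
// Values listed for refresh_interval in the Request Body table.
const REFRESH_INTERVALS = ['daily', 'weekly', 'monthly'];

// Hypothetical helper: validates and serializes an Add Web Source request body.
function buildWebSourceBody({ url, crawl_depth, include_patterns, exclude_patterns, refresh_interval }) {
  new URL(url); // throws TypeError unless url is an absolute URL
  // ASSUMPTION: only the type is checked; the valid range is not documented.
  if (crawl_depth !== undefined && !Number.isInteger(crawl_depth)) {
    throw new TypeError('crawl_depth must be an integer');
  }
  if (refresh_interval !== undefined && !REFRESH_INTERVALS.includes(refresh_interval)) {
    throw new RangeError(`refresh_interval must be one of: ${REFRESH_INTERVALS.join(', ')}`);
  }
  // JSON.stringify drops undefined properties, so omitted options are not sent
  // and the documented defaults (such as crawl_depth 1) apply.
  return JSON.stringify({ url, crawl_depth, include_patterns, exclude_patterns, refresh_interval });
}
```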

List Web Sources

GET /v1/datastores/:id/web-sources

Lists all web sources for a datastore.

Delete Web Source

DELETE /v1/datastores/:id/web-sources/:source_id

Removes a web source and its indexed pages.