Cloud#

There is a set of packages that allows interaction with cloud solutions using Python. This section considers them.

Databricks SDK#

Databricks is a platform for developing data applications. It provides a Python SDK.

As it provides interaction with a cloud-based platform, you have to set up authentication in the ~/.databrickscfg file. Check more on configuring authentication in:

A few important packages:

  • The databricks-connect package allows you to connect to the facilities of a Databricks cluster.

  • The databricks-feature-engineering package provides an API for managing the Databricks Feature Store. After installation, the databricks.feature_engineering module will be available in the environment.

  • The Databricks Utilities module, available inside Databricks as dbutils, allows you to manipulate the Databricks environment from Python code.

For more details check:


The simplest way to set up the configuration is through an authentication token.

Create the file ~/.databrickscfg; it should look like this:

[DEFAULT]
host = https://dbc-<some unique for workspace>.cloud.databricks.com
token = <here is your token>
  • The profile name DEFAULT is important. You can specify a different name, but DEFAULT is the one used when no profile is given.

  • The host you can copy from the browser URL line (just the host, without the path).

  • The token you can get through the Databricks UI: Settings -> Developer -> Access tokens -> Manage.
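The file follows the INI format, so you can sanity-check its structure with the standard library's configparser before pointing the SDK at it. A minimal sketch, using placeholder host and token values rather than real credentials:

```python
import configparser

# The ~/.databrickscfg file uses the INI format. The host and
# token below are placeholders, not real credentials.
example = """
[DEFAULT]
host = https://dbc-example.cloud.databricks.com
token = dapi-example-token
"""

config = configparser.ConfigParser()
config.read_string(example)

print(config["DEFAULT"]["host"])       # the workspace host
print("token" in config["DEFAULT"])    # True if a token is set
```

To check the real file, replace read_string with config.read on the path to ~/.databrickscfg.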

If the configuration is set up correctly, you should be able to run the following cell without any errors.

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
type(w)
databricks.sdk.WorkspaceClient

AWS#

The boto3 package implements an AWS SDK for Python. Obviously, you need an AWS subscription to use it, but you can try some of its functionality using the moto mocking framework.

For more details check the AWS SDK documentation.


The following cell shows a request sent to the AWS S3 service to list the buckets. The behaviour of boto3 is mocked with moto.

import moto
import boto3
from pprint import pprint

mock = moto.mock_aws()
mock.start()

s3 = boto3.client("s3", region_name="us-west-2")
pprint(s3.list_buckets())

mock.stop()
{'Buckets': [],
 'Owner': {'DisplayName': 'webfile', 'ID': 'bcaf1ffd86f41161ca5fb16fd081034f'},
 'ResponseMetadata': {'HTTPHeaders': {'content-type': 'application/xml',
                                      'x-amzn-requestid': 'cfpTl6w6Ppk4aeXACjlVu84OCADmC68LdAP4wXHowLYEBiKkljR9'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'cfpTl6w6Ppk4aeXACjlVu84OCADmC68LdAP4wXHowLYEBiKkljR9',
                      'RetryAttempts': 0}}

Azure#

Azure provides a set of SDK packages, each of which covers a different aspect of its API. Check the available packages in the Azure SDK Releases.

AI Model Inference#

The azure-ai-inference package implements methods to interact with AI models available in Azure.

You can try this package with the GitHub Models playground. Instead of Azure credentials, you can use your GitHub credentials for limited access.


The following cell sends a request to a Llama model.

Note. Set up the GITHUB_TOKEN environment variable to be able to access the model.

import os
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

endpoint = "https://models.github.ai/inference"
model = "meta/Llama-4-Scout-17B-16E-Instruct"
token = os.environ["GITHUB_TOKEN"]

client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(token),
)

response = client.complete(
    messages=[
        SystemMessage(""),
        UserMessage("What is the capital of France?"),
    ],
    temperature=0.8,
    top_p=0.1,
    max_tokens=2048,
    model=model
)

print(response.choices[0].message.content)
The capital of France is Paris.