GraphQL Tutorial - How to Interact w/ Harness API using Python

Python and the GQL module make it easy to interact with Harness GraphQL programmatically. This tutorial walks through two tasks: generating a CSV report of deployed instances by service and environment, and retrieving a paginated list of all users in an account.

We are constantly adding new Entities to our GraphQL, which makes programmatic interaction with Harness more and more interesting.

Naturally, we are starting to see multiple use cases that interact with Harness GraphQL programmatically, to get answers to questions like: “How many instances do I have for each Service? Ok, what about each Service by Environment?”

I’m addicted to Shell Script. It’s simple and powerful. But, depending on the complexity and how you manipulate your GraphQL result set, the code can become a nightmare to read (and support). So, I spent some time looking for a way to handle the HTTP side of this interaction automatically, while still being able to easily manipulate dictionary/JSON collections.

Of course, the answer was Python. I’ll explore this finding with you while I introduce the GQL module. I won’t enforce PEP 8, exception handling, etc. This blog post is just meant to give you a good introduction to this approach.

Requirements

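I’ll explore this in the tutorial. But, in case you want to test my project directly, you’ll need these Environment Variables set at runtime: HARNESS_GRAPHQL_API_KEY (your Harness API key), HARNESS_GRAPHQL_ENDPOINT (the GraphQL endpoint URL), and HARNESS_GQL_CSV_NAME (the name of the output CSV file).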

Tutorial

First Step

What are the required imports? For this example, I’ve imported these:

import os
import csv
import logging

from gql import Client, gql
from gql.transport.requests import RequestsHTTPTransport

You can install all dependencies by running:

python3 -m pip install -r https://raw.githubusercontent.com/gabrielcerioni/harness_instanceStats_gql_to_csv/main/requirements.txt

Second Step

Let’s define a simple logger and a few “constants” that I’ll get from Environment Variables:

logging.basicConfig(format='%(asctime)s - %(levelname)s - %(message)s', level=logging.INFO)

API_KEY = os.environ.get('HARNESS_GRAPHQL_API_KEY')
API_ENDPOINT = os.environ.get('HARNESS_GRAPHQL_ENDPOINT')
OUTPUT_CSV_NAME_CONST = os.environ.get('HARNESS_GQL_CSV_NAME')
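Since `os.environ.get` returns `None` for an unset variable, a missing value would only show up later as a confusing HTTP or file error. Here is a minimal sketch of a fail-fast check. The helper name `require_env` is hypothetical and not part of the original project; the three variable names are the ones read above.

```python
import os


def require_env(names, environ=None):
    """Return a dict of the requested variables, or raise if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# The three variable names this tutorial reads.
REQUIRED_VARS = (
    "HARNESS_GRAPHQL_API_KEY",
    "HARNESS_GRAPHQL_ENDPOINT",
    "HARNESS_GQL_CSV_NAME",
)
```

Calling `require_env(REQUIRED_VARS)` at the top of the entry point would stop the program with a readable message before any request is made.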

Third Step

I don’t want to make this code huge, so I’ll cut to the chase. Let’s define a good generic query function to deal with Harness GraphQL:

def generic_graphql_query(query):
    req_headers = {
        'x-api-key': API_KEY
    }

    _transport = RequestsHTTPTransport(
        url=API_ENDPOINT,
        headers=req_headers,
        use_json=True,
    )

    # Create a GraphQL client using the defined transport
    client = Client(transport=_transport, fetch_schema_from_transport=True)

    # Provide a GraphQL query
    generic_query = gql(query)

    # Execute the query on the transport
    result = client.execute(generic_query)
    return result

We can also do something very similar to run mutations:

def generic_graphql_mutation(mutation_query, params):
    req_headers = {
        'x-api-key': API_KEY
    }

    _transport = RequestsHTTPTransport(
        url=API_ENDPOINT,
        headers=req_headers,
        use_json=True,
    )

    # Create a GraphQL client using the defined transport
    client = Client(transport=_transport, fetch_schema_from_transport=True)

    # Provide a GraphQL query
    generic_query = gql(mutation_query)

    # Execute the query on the transport
    result = client.execute(generic_query, variable_values=params)
    return result
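Under the hood, both functions send roughly the same kind of HTTP request: a POST whose JSON body carries a `query` string and, optionally, a `variables` object, authenticated here with the `x-api-key` header. The sketch below builds (but does not send) such a request using only the standard library, just to make that shape visible. The helper name `build_graphql_request`, the endpoint, and the mutation text are all placeholders for illustration, not real Harness operations.

```python
import json
import urllib.request


def build_graphql_request(endpoint, api_key, query, variables=None):
    """Build (without sending) the POST request for a GraphQL-over-HTTP call."""
    payload = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


# Placeholder endpoint and mutation, for illustration only.
request = build_graphql_request(
    "https://example.invalid/graphql",
    "my-api-key",
    "mutation ($input: ExampleInput!) { example(input: $input) { id } }",
    {"input": {"name": "demo"}},
)
body = json.loads(request.data.decode("utf-8"))
```

This is the part the GQL transport takes care of for us, which is why the two functions above only differ in whether they pass `variable_values`.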

Real Use Cases

Let’s explore two real use cases from our customers.

Use Case 1

Customer Statement: I need to generate (on demand) a CSV Report for all Instances Deployed, by Service and Environment. I also need the Service ID to make sure I’m counting this right.

Resulting Project (I’m still enhancing this): GitHub - gabrielcerioni/harness_instanceStats_gql_to_csv: This is a simple Python that will parse instanceStats GraphQL Query into a CSV.

Building on everything we’ve done so far, I need two more functions:

One to retrieve a simple instanceStats result set;
Another to take that result set and write everything to a UTF-8 CSV file.

This is pretty much it:

def get_all_instances_by_service_by_env():
    query = '''{
      instanceStats(groupBy: [{entityAggregation: Service}, {entityAggregation: Environment}]) {
        ... on StackedData {
          dataPoints {
            key {
              name
              id
            }
            values {
              key {
                name
              }
              value
            }
          }
        }
      }
    }'''
    generic_query_result = generic_graphql_query(query)

    return generic_query_result

def parse_result_to_csv(instanceStats_gql_resultset):
    # just for readability - I'll build a cleaner result set to make it easier to CSV this later
    clean_dict_list = []

    result_list = instanceStats_gql_resultset['instanceStats']['dataPoints']

    for service_item in result_list:
        service_name = service_item['key']['name']
        service_id = service_item['key']['id']

        instance_environments = service_item['values']

        # one CSV row per (service, environment) pair
        for service_instance in instance_environments:
            current_dict_entry = {
                'Service_Name': service_name,
                'Service_ID': service_id,
                'Environment': service_instance['key']['name'],
                'Instance_Count': service_instance['value'],
            }
            clean_dict_list.append(current_dict_entry)

    # note: fieldnames come from the first row, so this assumes at least one data point
    with open(OUTPUT_CSV_NAME_CONST, 'w', encoding='utf8', newline='') as output_file:
        fc = csv.DictWriter(output_file, fieldnames=clean_dict_list[0].keys())
        fc.writeheader()
        fc.writerows(clean_dict_list)

    return clean_dict_list
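To see what the parsing step does without calling the API, here is a small sketch that runs the same flattening on a hand-written sample shaped like the instanceStats response. The sample values and the helper names `flatten_instance_stats` and `rows_to_csv_text` are made up for illustration. It declares the field names explicitly, so an empty result still produces a header row instead of failing on the first-row lookup.

```python
import csv
import io

FIELDNAMES = ["Service_Name", "Service_ID", "Environment", "Instance_Count"]


def flatten_instance_stats(resultset):
    """Turn the nested Service -> Environment data points into flat rows."""
    rows = []
    for service in resultset["instanceStats"]["dataPoints"]:
        for env in service["values"]:
            rows.append({
                "Service_Name": service["key"]["name"],
                "Service_ID": service["key"]["id"],
                "Environment": env["key"]["name"],
                "Instance_Count": env["value"],
            })
    return rows


def rows_to_csv_text(rows):
    """Serialize rows to CSV text; the header is written even with no rows."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample in the shape the query returns.
sample = {"instanceStats": {"dataPoints": [
    {"key": {"name": "checkout", "id": "svc-1"},
     "values": [{"key": {"name": "dev"}, "value": 2},
                {"key": {"name": "prod"}, "value": 5}]},
]}}
csv_text = rows_to_csv_text(flatten_instance_stats(sample))
```

Keeping the flattening separate from the file writing also makes it easy to check the row layout before pointing the script at a real account.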

Then, we can easily coordinate everything in the “main” entry point:

if __name__ == '__main__':
    logging.info("Starting the Program...")

    logging.info("Retrieving your current instanceStats GraphQL Query result set...")
    result_from_query = get_all_instances_by_service_by_env()
    logging.info("Done!")

    logging.info("Expanding all rows from the nested dict - and then putting it on the CSV: {0}".format(OUTPUT_CSV_NAME_CONST))
    parsed_result_set = parse_result_to_csv(result_from_query)
    logging.info("Done! Outputting the list content here:")
    print(parsed_result_set)

    logging.info("Program Exited! Have a nice day!")

Use Case 2

Customer Statement: I need to generate an output with ALL Users from my account. But I have a lot, and this will paginate ad infinitum. Please help and KT this ASAP!

Resulting Project (I’m still enhancing this): GitHub - gabrielcerioni/harness_graphql_labs: Gabs the CSE - Harness - GraphQL Labs - Python.

My GH Project has a dummy loader, but I don’t recommend running that! We can focus on a function that will query all the users, but that also knows how to deal with GraphQL offset/pagination:

def get_harness_account_users():
    offset = 0
    has_more = True
    total_user_list = []

    while has_more:
        query = '''{
          users(limit: 100, offset: ''' + str(offset) + ''') {
            pageInfo {
              total
              limit
              hasMore
              offset
            }
            nodes {
              name
            }
          }
        }'''

        generic_query_result = generic_graphql_query(query)
        loop_user_list = generic_query_result["users"]["nodes"]
        total_user_list.extend(loop_user_list)

        #total = generic_query_result["users"]["pageInfo"]["total"]
        has_more = bool(generic_query_result["users"]["pageInfo"]["hasMore"])

        if has_more:
            offset = offset + 100

    return total_user_list
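The same loop can be written with GraphQL variables instead of string concatenation, and with the API call injected so the paging logic can be tried offline. This is only a sketch: the names `fetch_all_users` and `fake_execute` are hypothetical, the `Int!` variable types are an assumption about the schema, and the 250 users are made up.

```python
USERS_QUERY = """
query ($limit: Int!, $offset: Int!) {
  users(limit: $limit, offset: $offset) {
    pageInfo { hasMore }
    nodes { name }
  }
}
"""


def fetch_all_users(execute, page_size=100):
    """Collect every user by following offset pagination until hasMore is false.

    `execute(query, variables)` is any callable returning the decoded response dict.
    """
    users, offset = [], 0
    while True:
        page = execute(USERS_QUERY, {"limit": page_size, "offset": offset})["users"]
        users.extend(page["nodes"])
        if not page["pageInfo"]["hasMore"]:
            return users
        offset += page_size


# Fake executor standing in for the API: serves 250 made-up users.
def fake_execute(query, variables):
    everyone = [{"name": "user-%d" % i} for i in range(250)]
    start = variables["offset"]
    stop = start + variables["limit"]
    return {"users": {"nodes": everyone[start:stop],
                      "pageInfo": {"hasMore": stop < len(everyone)}}}


all_users = fetch_all_users(fake_execute)
```

Against the real API, `execute` could be the `generic_graphql_mutation` function from the Third Step, since it already forwards `variable_values` to the client.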

Then, we could have this very simple entry point:

if __name__ == '__main__':
    logging.info("Starting the Program...")

    logging.info("Getting all users from your Harness Account")
    result_from_query = get_harness_account_users()
    logging.info("Done! You have {0} users in your Account!".format(len(result_from_query)))
    print("")
    logging.info("Printing the User List on your STDOUT")

    print(result_from_query)

    logging.info("Program Exited!")

Outcome

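Here's what we can expect to see, specifically for Use Case 1: a CSV file with one row per Service and Environment pair, with the columns Service_Name, Service_ID, Environment, and Instance_Count.

Hopefully this was helpful to you guys! We only went over two use cases, but there are tons more that this approach could help with.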

As always, if you have any questions, shoot me a message! I'd be happy to help.

-Gabriel

Gabriel Cerioni

Tutorial: [GraphQL] How to Interact With the Harness API Using Python
| Harness Blog
