1 - Finding Collections & Streams#

In this tutorial we’ll use the API to find data streams. The functionality is very similar to exploring data using the all streams table in the plotter.

Let’s start by defining some terminology: - A point is an individual measurement, consisting of a time stamp and a value. - A stream is a time series of points. - A collection is a grouping of streams – for examples measurements streaming from a single sensor recording on multiple channels.

If you would like to learn more about any of the topics covered here, please see our API documentation documentation.

To gain access to the data needed to run this notebook, you’ll need to register for an API key at ni4ai.org.

Imports#

[1]:
import btrdb
from tabulate import tabulate

Establish Server Connection#

We always start with establishing a connection to the server using the connect function from the btrdb library. The connect function takes two optional arguments - the address of the BTrDB cluster and an API key to identify the user.

[2]:
conn = btrdb.connect()

Finding Collections#

Time series data in BTrDB is organized into collections which can be thought of as a hierarchical paths such as CALIFORNIA/SanFrancisco/91405. Within this collection/path you can put as many time series streams as you like. Listing all available collections is easy an can be done with the list_collections method from the primary database handle.

[3]:
import pandas as pd

collections = conn.list_collections('ni4ai')
collections.sort()

for i, c in enumerate(collections):
    levels = c.split('/')
    for j, l in enumerate(levels):
        if i == 0:
            pass
        elif l in collections[i-1]:
            continue
        print(j*' ','->', l)

 -> ni4ai
  -> baltimore
  -> bellingham
  -> bishop_creek
  -> longmont
  -> oakland
  -> portland
  -> richmond
  -> scottsdale
  -> silver_spring

Narrowing in on specific collections#

Alternatively, you can use a targetted search if you want to limit the results to a particular set of collections by providing the first part of the collection path.

[4]:
conn.list_collections("sunshine/PMU")
[4]:
['sunshine/PMU1',
 'sunshine/PMU2',
 'sunshine/PMU3',
 'sunshine/PMU4',
 'sunshine/PMU5',
 'sunshine/PMU6']

Finding Streams#

Streams in BTrDB are one of the most important objects you will be dealing with. Each represents a particular time series within the database and contains both metadata as well as the underlying time/value pairs.

We will look at stream objects in more detail as a future exercise but for now we will concentrate on just retrieving the stream objects.

Search By Collection#

The easiest way to find the particular streams you are looking for is to use the streams_in_collection method. In the simplest use case, you can provide the collection that contains your streams.

Note that this method returns a generator and so the examples below convert it to a list to retrieve the data.

[5]:
streams = list(conn.streams_in_collection('sunshine/PMU1'))
streams
[5]:
[<Stream collection=sunshine/PMU1 name=LSTATE>,
 <Stream collection=sunshine/PMU1 name=C1ANG>,
 <Stream collection=sunshine/PMU1 name=C3MAG>,
 <Stream collection=sunshine/PMU1 name=C2MAG>,
 <Stream collection=sunshine/PMU1 name=C1MAG>,
 <Stream collection=sunshine/PMU1 name=C3ANG>,
 <Stream collection=sunshine/PMU1 name=L3ANG>,
 <Stream collection=sunshine/PMU1 name=L2ANG>,
 <Stream collection=sunshine/PMU1 name=L3MAG>,
 <Stream collection=sunshine/PMU1 name=L1ANG>,
 <Stream collection=sunshine/PMU1 name=C2ANG>,
 <Stream collection=sunshine/PMU1 name=L1MAG>,
 <Stream collection=sunshine/PMU1 name=L2MAG>]

Displaying Metadata#

Each of these streams has its own metadata such as collection, name, uuid and so on. Let’s create a simple convenience function to display the stream metadata using the tabulate library.

[6]:
def describe_streams(streams):
    table = [["Collection", "Name", "Units", "Version", "UUID"]]
    for stream in streams:
        tags = stream.tags()
        table.append([
            stream.collection, stream.name, tags["unit"], stream.version(), stream.uuid
        ])
    return tabulate(table, headers="firstrow")

print(describe_streams(streams))
Collection     Name    Units      Version  UUID
-------------  ------  -------  ---------  ------------------------------------
sunshine/PMU1  LSTATE  mask        243640  6ffb2e7e-273c-4963-9143-b416923980b0
sunshine/PMU1  C1ANG   deg         240607  d625793b-721f-46e2-8b8c-18f882366eeb
sunshine/PMU1  C3MAG   amps        240481  fb61e4d1-3e17-48ee-bdf3-43c54b03d7c8
sunshine/PMU1  C2MAG   amps        240718  d765f128-4c00-4226-bacf-0de8ebb090b5
sunshine/PMU1  C1MAG   amps        240380  1187af71-2d54-49d4-9027-bae5d23c4bda
sunshine/PMU1  C3ANG   deg         240781  0be8a8f4-3b45-4fe3-b77c-1cbdadb92039
sunshine/PMU1  L3ANG   deg         240862  e4efd9f6-9932-49b6-9799-90815507aed0
sunshine/PMU1  L2ANG   deg         240662  886203ca-d3e8-4fca-90cc-c88dfd0283d4
sunshine/PMU1  L3MAG   volts       229263  b2936212-253e-488a-87f6-a9927042031f
sunshine/PMU1  L1ANG   deg         229265  51840b07-297a-42e5-a73a-290c0a47bddb
sunshine/PMU1  C2ANG   deg         229263  97de3802-d38d-403c-96af-d23b874b5e95
sunshine/PMU1  L1MAG   volts       229266  35bdb8dc-bf18-4523-85ca-8ebe384bd9b5
sunshine/PMU1  L2MAG   volts       229264  d4cfa9a6-e11a-4370-9eda-16e80773ce8c

Filtering on metadata#

We can also include extra parameters to streams_in_collection when searching for streams. Streams contain dictionaries for metadata called tags and annotations. Tags are generally reserved for internal use while annotations are for custom metadata.

Let’s do our search again but narrow our results to just include streams that have a unit of “amps”. Similarly we can provide a dictionary for the custom annotation data if that would help to narrow our search.

[ ]:
streams = conn.streams_in_collection('sunshine/PMU1', tags={"unit": "amps"})
print(describe_streams(streams))
[ ]: