Designs for Internet of Things (IoT) applications are expected to generate reams of data to fill the cloud with actionable information. For most IoT developers, however, streaming data to the cloud and making sense of it requires expert-level understanding of technologies for data exchange and analytics.
Without simple methods for analyzing data streams from multiple sources deeply and in real time, IoT developers can find themselves struggling to understand how data quality issues affect the overall functional performance of their applications.
To help with this, Initial State Technologies provides an analytics service that developers can use to easily gain a deep understanding of IoT device data. They can then convert it to useful information that is accessible on desktops or mobile devices.
The need for embedded analytics
For any IoT application, the product of interest to its users is the actionable information derived from data streams generated by sensor nodes attached to individuals, equipment, or structures. Among the challenges facing IoT developers, the verification of these data streams requires the ability to visualize and measure the data itself.
For this, IoT application developers can take advantage of a growing array of cloud-based services to transform raw data into the required result. Within these data processing workflows, data analytics provides a critical capability for humans to understand the nature of the data to ensure its quality and identify useful events or anomalies that cannot be anticipated in the workflow.
For developers, IoT data analytics brings an entirely new set of requirements beyond those familiar to Web developers. While Web analytics typically relies on offline batch processing of large log files, IoT data analytics combines large data repositories ("big data") with "fast data." To be effective, IoT applications must process large streams of data in real time to react to real-world events.
Enterprise-level concepts like lambda architectures and tools like Apache Storm and Spark have emerged to provide platforms for building data analytics into workflows for large scale IoT applications. For most development organizations, however, the time and resources required to build these platforms and create their own data analytics solution can be a distraction at best. As a result, IoT developers might ignore data analytics with the expectation that the data sources are "known-good" resources and the backend code will catch any problems.
In fact, IoT application developers face a growing disconnect between the amount of available data and the need for easily deployed analytics features for debugging IoT applications and for providing useful information to users.
Initial State addresses this critical gap with a software-as-a-service (SaaS) solution that makes it simple for IoT developers to deploy real-time analytics in any application. By adding only a few lines of code, developers (or customers) can visualize IoT data as it develops using customizable dashboards comprising a set of graphical displays of specific streams or events.
To explore details of a data stream, Initial State's Waves tool provides an interactive waveform viewer that lets users zoom into the data stream. The company's Lines tool serves as a multi-waveform analyzer. Like its bench-top predecessor, this cloud-based tool lets developers examine, compare, and measure multiple signal waveforms in real time (Figure 1).
Figure 1: The Initial State platform presents data streams in a number of formats including multi-waveform displays needed to analyze complex IoT data. (Image source: Initial State Technologies)
SaaS solution
As with any SaaS package, Initial State maintains service resources in the cloud and provides a simple application programming interface (API) for streaming data. Developers use standard HTTP GET and POST requests to stream data to "event buckets," which provide the storage resource in the Initial State platform.
Developers can use a simple GET request to create a bucket:
https://groker.initialstate.com/api/buckets?accessKey=myAccessKey&bucketKey=myBucketKey&bucketName=myBucketName
As this request illustrates, each bucket is identified using a combination of the developer's Initial State private access key (provided by Initial State at signup) and unique bucket key (provided by the engineer at bucket creation). The company also offers finer-grained authentication levels for enterprise customers. When creating an event bucket, developers can also provide an optional bucket name that is later used to identify data sourced from that bucket in the various Initial State graphical displays.
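To make this concrete, the same bucket-creation request can be assembled with Python's standard library alone. This is a minimal sketch, assuming the endpoint and parameter names shown above; the key values are placeholders, not real credentials:

```python
from urllib.parse import urlencode

def bucket_creation_url(access_key, bucket_key, bucket_name):
    """Build the GET URL that creates an event bucket."""
    params = urlencode({
        "accessKey": access_key,
        "bucketKey": bucket_key,
        "bucketName": bucket_name,
    })
    return "https://groker.initialstate.com/api/buckets?" + params

url = bucket_creation_url("myAccessKey", "myBucketKey", "myBucketName")
# Fetching this URL (e.g. with urllib.request.urlopen(url)) would
# create the bucket on the Initial State platform.
```

Using `urlencode` rather than manual string concatenation ensures that bucket names containing spaces or other special characters are escaped correctly.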
Initial State stores data as key:value pairs, so developers can use the same sort of GET request to store multiple events (with keys eventKeyN and values eventValueN, say):
https://groker.initialstate.com/api/events?accessKey=myAccessKey&bucketKey=myBucketKey&eventKey0=eventValue0&eventKey1=eventValue1&eventKey2=eventValue2
Merging IoT data is simple: Developers can combine data from separate sources by streaming the data from each source to the same bucket, identified by the globally unique combination of accessKey and bucketKey.
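The multi-event GET request above follows the same pattern and can be built the same way. In this sketch the sensor names and values are illustrative; only the endpoint and the accessKey/bucketKey parameters come from the request shown above:

```python
from urllib.parse import urlencode

def events_url(access_key, bucket_key, events):
    """Build a GET URL that streams several key:value events at once.

    `events` maps event keys (e.g. sensor names) to their current values.
    """
    params = {"accessKey": access_key, "bucketKey": bucket_key}
    params.update(events)
    return "https://groker.initialstate.com/api/events?" + urlencode(params)

url = events_url("myAccessKey", "myBucketKey",
                 {"temperature": 22.5, "humidity": 41})
# Fetching this URL would log both readings to the same bucket.
```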
Developers can also create buckets and stream data using a common utility such as curl to issue POST requests. In this case, the authentication keys (accessKey and bucketKey) are passed in the request header and data key:value pairs are passed in JSON format in the POST body as follows:
https://groker.initialstate.com/api/events
Content-Type: application/json
X-IS-AccessKey: myAccessKey
X-IS-BucketKey: myBucketKey
Accept-Version: ~0

[
  {
    "key": "temperature",
    "value": "1",
    "epoch": 1419876021.778477
  },
  {
    "key": "temperature",
    "value": "2",
    "epoch": 1419876022.778477
  }
]
As this POST request illustrates, multiple instances of data from a single sensor, temperature for example, are represented with the same key. The epoch value is the event timestamp for that sensor reading provided in standard Unix time representation. Thus, developers can easily build streams of time-based data from a single sensor or stream time-based data for multiple sensors using an appropriate key for each sensor source.
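The same POST request can also be assembled in Python with the standard library. This sketch mirrors the headers and JSON body shown above; the key values are placeholders, and the helper name is an illustration rather than part of any Initial State library:

```python
import json
from urllib.request import Request

def build_events_post(access_key, bucket_key, readings):
    """Assemble the POST request shown above: the authentication keys go
    in the headers, and a list of timestamped key:value events goes in
    the JSON body."""
    body = [
        {"key": key, "value": str(value), "epoch": epoch}
        for key, value, epoch in readings
    ]
    return Request(
        "https://groker.initialstate.com/api/events",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-IS-AccessKey": access_key,
            "X-IS-BucketKey": bucket_key,
            "Accept-Version": "~0",
        },
        method="POST",
    )

req = build_events_post("myAccessKey", "myBucketKey",
                        [("temperature", 1, 1419876021.778477),
                         ("temperature", 2, 1419876022.778477)])
# urllib.request.urlopen(req) would send the events to the bucket.
```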
Rather than building raw HTTP requests for each transaction, developers can take advantage of C, Java, Node.js, and Python libraries provided by Initial State or the Initial State developer community. For example, the C library uses libcurl to make POST requests and provides two functions, create_bucket and stream_event, with parameters corresponding directly to the keys and data used in the raw GET and POST requests described previously:
int create_bucket(char *access_key, char *bucket_key, char *bucket_name);
int stream_event(char *access_key, char *bucket_key, char *json);
Analytics deployment
The Initial State SaaS analytics solution provides a ready complement to an emerging class of off-the-shelf IoT development kits. For example, the Connected Cellular BeagleBone IoT Development Kit provides a platform for rapid implementation of an IoT cellular wireless node. Along with a complete cellular package, the kit includes the BeagleBone Black board, featuring a Texas Instruments ARM® Cortex®-A8-based AM335x processor and multiple expansion interfaces including Ethernet and USB. Developers can literally plug in support for a broad array of sensing applications by connecting readily available Grove-compatible sensors to the board through the provided Grove cape board from Seeed Technology (Figure 2).
Figure 2: With the Connected Cellular BeagleBone IoT Development Kit, IoT developers can add sensor functionality simply by plugging in a Grove-compatible sensor such as the light sensor shown here. (Image source: Initial State Technologies)
Developers can plug analytics into their IoT application just as simply. After SSHing into the BeagleBone Black board, the developer can install an Initial State library such as the ISStreamer Python module. Initial State provides an installation script for the module, avoiding common problems caused by incorrect manual installation. Installation therefore requires only one shell command, which downloads the script with curl (the leading backslash bypasses any shell alias to ensure the actual curl binary runs) and pipes it to bash:
$ \curl -sSL https://get.initialstate.com/python -o - | sudo bash
Software engineers would create a bucket with a single line of Python code added to their Python script running on the board:
stream = Streamer(access_key="myAccessKey", bucket_key="myBucketKey")
Data values are streamed by calling the log method of the stream object, so engineers stream data to the Initial State service by making repeated calls to stream.log(key, value, epoch) as appropriate.
In fact, Initial State provides a sample Python application built to support the Connected Cellular BeagleBone IoT Kit. Among the software routines, a sample application demonstrates a basic design pattern for setting up the data streamer and streaming light sensor values captured from the board's analog-to-digital converter (ADC) (Listing 1). In this case, the application transmits the readings as subjective levels, which are displayed accordingly as a tile in the associated dashboard (Figure 3).
#Read from light sensor
from time import sleep
import Adafruit_BBIO.ADC as ADC
from ISStreamer.Streamer import Streamer

#Initialize the streamer
streamer = Streamer(bucket_name="BBB Readings", bucket_key="myBucketKey", access_key="myAccessKey")

pin = "AIN2"
ADC.setup()

while True:
    value = ADC.read_raw(pin)
    if value <= 250:
        level = "dim"
    elif value <= 500:
        level = "average"
    elif value <= 800:
        level = "bright"
    else:
        level = "very bright"
    print(value)
    #Stream the value
    streamer.log(":level_slider:Sensor Reading", value)
    print(level)
    #Stream the level
    streamer.log(":bulb:Brightness", level)
    #Sleep for 1 minute
    sleep(60)
Listing 1: The Initial State sample Python application for the Connected Cellular BeagleBone IoT Kit demonstrates a simple technique for acquiring and streaming sensor data to an event bucket. (Code source: Initial State Technologies)
Figure 3: The Initial State platform lets developers create dashboards of graphical windows, or tiles, that each display analytics derived from individual IoT data streams. (Image source: Initial State Technologies)
Extensibility and scaling
Developers can extend this basic pattern to monitor and explore data streams in real time as well as set triggers to fire off a text message or email to a user. The same mechanism can also send a notification to an external service such as Amazon Web Service's Simple Queue Service (SQS). AWS SQS provides a cloud-based message queue that can be consumed by other distributed services. Using these kinds of services, developers can easily extend their Initial State deployment with custom services to create powerful, sophisticated data-processing workflows.
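The trigger mechanism described above can be sketched as a simple threshold check that produces a notification payload. This is a minimal illustration, not Initial State's implementation: the function name and threshold are hypothetical, and actually delivering the payload to SQS would use a queue client such as boto3 (not shown):

```python
import json

BRIGHTNESS_THRESHOLD = 800  # fire the trigger when a reading exceeds this

def check_trigger(key, value, threshold=BRIGHTNESS_THRESHOLD):
    """Return a queue-ready notification payload when a reading crosses
    the threshold, or None when no action is needed."""
    if value <= threshold:
        return None
    return json.dumps({
        "event": "threshold_exceeded",
        "key": key,
        "value": value,
        "threshold": threshold,
    })

msg = check_trigger("Sensor Reading", 950)
# A queue client would then deliver the payload, e.g.:
# sqs.send_message(QueueUrl=queue_url, MessageBody=msg)
```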
Conversely, developers can add analytics capabilities to an IoT application workflow that already pumps data into an external data pool. For example, engineers can use cloud-based services such as AWS Lambda to periodically stream data from that pool to Initial State.
AWS Lambda provides a serverless mechanism for executing code snippets in response to external events, so developers can deploy analytics without further loading existing IoT application resources. With this approach, the Lambda function might be scheduled to run periodically and perform the basic POST request to stream data (Listing 2).
// Send data to Initial State
const https = require('https');

function sendToInitialState(accessKey, data, callback) {
  const req = https.request({
    hostname: 'groker.initialstate.com',
    port: '443',
    path: '/api/events',
    method: 'POST',
    headers: {
      'X-IS-AccessKey': accessKey,
      'X-IS-BucketKey': ISBucketKey,  // bucket key defined elsewhere in the snippet
      'Content-Type': 'application/json',
      'Accept-Version': '~0'
    }
  }, (res) => {
    let body = '';
    console.log('Status:', res.statusCode);
    console.log('Headers:', JSON.stringify(res.headers));
    res.setEncoding('utf8');
    res.on('data', (chunk) => body += chunk);
    res.on('end', () => {
      console.log('Successfully processed HTTPS response');
      // If we know it's JSON, parse it
      if (res.headers['content-type'] === 'application/json') {
        body = JSON.parse(body);
      }
      callback(null, body);
    });
  });
  req.on('error', callback);
  req.end(JSON.stringify(data));
}
Listing 2: Among its sample code offerings, Initial State shows how to implement data streaming for analytics with code snippets running on AWS Lambda, using either Python or the JavaScript (Node.js) snippet shown here. (Code source: Initial State Technologies)
IoT developers hope to see their applications scale rapidly with growing customer acceptance, and Initial State intends for its service to scale along with the applications. Built on AWS, the service takes advantage of AWS features specifically designed to scale to support a very large number of streams, and very high data rates within those streams. Further, to ensure continued availability, event buckets are replicated across AWS availability zones (physical diversity across different data centers within a geographical region). Enterprise customers can take advantage of replication across AWS regions (global-scale diversity across data centers located in different geographical regions).
Conclusion
The ability to display raw or transformed IoT data in real time plays a critical role in helping developers understand data quality and in providing users with the core functionality of the IoT application itself. Although mechanisms for implementing IoT data analytics exist, building suitable visualization capabilities in-house can distract from or delay a project. The Initial State software-as-a-service platform offers a simple approach for deploying real-time data analytics in IoT applications. Using this service, IoT developers can debug data streams from multiple sources and more easily provide real-time informational displays to end users.