
Gold filling



What could possibly be worse than a routine trip to the dentist that culminates in the "surprise" discovery of a gargantuan dental cavity that needs to be treated?  Like, urgently.  Of course, there are some things that we can think of off the top of our aching heads; like if our local pharmacy announced that they will only accept payments in cryptocurrencies, or if there was a worldwide shortage of quality I.P.A., otherwise known as ibuprofen, paracetamol and aspirin.

But - arguably - staggering home after the expensive appointment with a face paralysed from anaesthetics, only to find that we have a mysterious gap in our sensor data is worse.  Much worse.  Because - like teeth - lost sensor readings, once gone, are gone forever.  But unlike decaying teeth, they cannot be partially filled.  Or repaired.  And if these are minute-by-minute temperature measurements from that smart, Internet-connected, integrated thermometer thingummy attached to the compost bin outside, well, our entire groundbreaking research into the statistical correlation between global warming and mid-afternoon pong is in jeopardy.  What a stink.

Apparently it was OK for the home broadband router to be powered off while we were strapped to the dentist's chair having our enamels carved into miniature statues of SpongeBob, because the little ones needed a wall socket for their Hello Kitty radio.  But never ones to admit defeat, we're going to fight back.  Because our devices should perform equally questionably, consistently dodgily, whether connected to a network or not.  And we ought to be able to back-fill any data, like a true £150-an-hour professional.

Orzodontics:

There's not a lot of hardware spaghetti needed for this particular recipe.  In fact, we only require these off-the-shelf pasta shapes:

  • A Raspberry Pi running Raspbian OS.  The only requirement we have to place on this thing is that it can run Python and SQLite, and that it can (every now and again if requested to do so nicely) connect to the Wild Wild Web.  This is all pretty standard stuff for the humble Pi, however.  And not worth the additional sentence.  Yes, this additional sentence.  And this one.
  • If we are out and about again in the wilderness, which we will be, then the only likely way in which we can connect to the aforementioned inter-webular network of people's unsolicited thoughts expressed in Tweets and vlogs, is via a mobile phone with the occasional signal.  A minute or two of the smartphone configured as a hotspot for tethering should be long enough for our strange curiosity to make contact with its mothership - AWS IoT.  In short, sufficient time to dispatch back all accumulated readings since the last successful update.
  • In theory, we are sending back sensor readings.  Or GPS coordinates.  Or probably both.  But we've already done a tonne of this back in I-O-Tea.  And it has become a tad tedious.  In this outing, we'll simply pretend that we have some made-up temperature readings.  Like a true climate change denier.

Ultimately, it doesn't matter.  If whatever package we have sits snugly inside a brightly-coloured 3D-printed case, and has an innocuous-looking aerial protruding out of it, we're in business.  Welcome to our family of bizarre, portable devices.  Everything else (including reliability and usability) becomes secondary to the illusion of something looking smart from this point onwards.

'Um disease:

Um... Sure. So this is actually the very first blog in our brand spanking new series, um... focusing on AWS IoT usage in the wild.  It's called I-O-Mr-T.  And there's nothing much else to say here, really.

Um... the previous series was I-O-Tea, and after 10 posts, it... um... generally descended into controlled mayhem and anarchy.  Except we did achieve a lot.  And it did set the foundations for pretty much everything we do in this new blog series.  Um. Um.

Um.

Your upcoming dental (dis)appointment:

Here's a very simple treatment plan, for all 21 of our decaying teeth (hopefully all covered by our dental insurance).  Because like munching through a super-sized bag of Haribos before breakfast every morning, all elements need to be carefully tackled in a certain order.  Enjoy!

  • Install SQLite on the Raspberry Pi, or more specifically, sqlite3.  It's better than 1 or 2.  Just because it's 3.
  • Play around with sqlite3 (the client) and its estranged cousin once removed, sqlite3 (the Python library).  Create a database, a table and a number of records containing wholly doctored temperature data.  Then, run some SQL queries.  Why not?  This is the table we'll use to store our cached sensor readings that haven't yet been shipped back to AWS IoT.
  • Right, let's give this a try.  We'll upload our cached data to AWS IoT using its REST API HTTP endpoint.  We can use the Python Requests library for this.
  • ...Once we confirm that everything is working in isolation, we will need to write two simple applications; one to insert sensor readings into the table, and another to retrieve them and dispatch them back to AWS IoT.

/root filling:

All high jinks, jollity and merriment aside, there is actually a serious point behind all of this (sort of... stay with us).  We have a peculiar attachment to going outdoors equipped with random DIY IoT paraphernalia.  And these walks can be quite long (like, 40km+ long).  And we keep encountering fundamental issues when we start to venture away from the nearest branch of Costa Coffee, or Starbucks.  There is generally no Wi-Fi.  And no toilet facilities, for that matter.  Or someone nearby pretending to work for a hip start-up, who is in fact looking up filtered Instagram pictures of cats that resemble celebrities on their MacBook (like all day).

Take the 10 peaks of the Brecon Beacons as an example.

We get sporadic cellular coverage (up to 4G) along the ridges and on the peaks, but rarely in-between.  And since we spend a vast proportion of our time lost in the valleys, our sensor readings go nowhere fast.  We could use LoRa, but there are no guarantees that there are LoRaWAN gateways in range, nor is it appropriate for large volumes of data, or uploads at regular intervals.  All a bit disappointing, because we end up storing all our data locally on the device, and retrieving them manually once back at base.  Not very smart.  Not very connected.

But we do have Internet, right?  Sometimes?  Sure.

What we therefore need for any solution to be more interactive outdoors is a way to cache our sensor readings.  Store them on the device in the first instance, but upload them in batches whenever there is temporary connectivity.  Of course, this won't happen in real-time.  But it's better than no-time.  And we shouldn't lose any measurements.  Any interested parties can (semi-)track actual readings as the journey progresses.  Everyone wins, except unlucky family members who get spammed with cryptic email messages like {"temperature": 13}.  Sorry, GDPR.  These recipients implicitly signed up to this service when they agreed to have us in the family.

And the most efficient way to smooth over this online / offline malarkey would be to use a local database on the device.  To this end, we'll use a SQLite database to cache our sensor readings.  Then, we'll have a separate application that queries this database's table for readings that haven't been successfully uploaded yet, and uploads them using AWS IoT's HTTP endpoint whenever there is connectivity.

Family members: be aware.  You are all about to receive regular spam containing meaningless sensor readings during our trips, and if you are lucky, GPS coordinates.

You never know.  It might actually be of use if we fall into the rough sea off the Jurassic Coast.


Drill surgeon:

What would be lovely is if we had access to a small, fast, self-contained, high-reliability, full-featured, SQL database engine.  Preferably, the most used database engine in the world.  Oh look.  Here's one we found on the web.  How very convenient.


SQLite might not be so appropriate if we need a more fully fledged database engine which can store 10 petabytes of data on the colour and height of individual blades of grass found during our trek.  Or if the database is to be accessed by 10,233 botanists with an interest in the dataset.  Who all inexplicably run blockchain and machine learning on it.  But for the very little we want to achieve, this should be more than adequate.

Let's start by installing sqlite3 on the Raspberry Pi running Raspbian OS (if it isn't there already).  Being a commonly used, lightweight relational database, sqlite3 has a habit of already being installed on the operating system.

sudo apt-get install sqlite3


Once installed, let's check the version.  Why?  We don't know, really.  Perhaps it makes us look like we know what we're doing?  Ah, no.  Let's do it just to check that it's actually there.

sqlite3 -version


Wow, that enlightened us somewhat.  Next, we'll create a database.  Which is actually as simple as running sqlite3 with the name of the new database file, in the directory where we want the file to be created.

sqlite3 riot.db

Then, we can run a whole bunch of SQL statements which we'll describe in a second.


Oops.  Here's what we just did...

We ran a CREATE TABLE... statement to guess what... create our table named sensor_data.  This is our tablette that we'll use to store our sensor readings in, and track if they have been uploaded or not.  Specifically, our JSON string will be stored in a payload TEXT column.  Column processed simply marks if the last upload of the record to AWS IoT Core was successful or not.

The auto-incrementing id INTEGER PRIMARY KEY column is also pivotal in all of this, since this will act as the unique identifier for that particular entry.

These are probably the bare minimum columns required for the functionality.  Nonetheless, we could add some more if there is a reason to.

We also ran two INSERT INTO statements to test that we can store our totally fictitious temperature readings.  Note that id is the primary key and auto-increments, and processed simply defaults to 0, so we only need to supply the payload.  This action is followed up by a SELECT * statement to see if the data has indeed been stored.

CREATE TABLE sensor_data (id INTEGER PRIMARY KEY, payload TEXT, processed INTEGER DEFAULT 0);
INSERT INTO sensor_data (payload) VALUES ('{"temp":10}');
INSERT INTO sensor_data (payload) VALUES ('{"temp":11}');
SELECT * FROM sensor_data;

This is all very good.  But not very exciting.  It's time to finish the tiny filling on that molar, and move onto the root canal.  And there's no need for modern dentistry techniques, like anaesthetics.  A bite from a certain, ambivalent reptile will work just fine here.

Python has a sqlite3 library, and we can use it to interact with our SQLite database programmatically.  It's apparently one degree warmer than when we last ran the INSERT INTO statement, so here is another reading to throw at this database.

import sqlite3

conn = sqlite3.connect("riot.db")
sensor_data_table = "sensor_data"
test_data = {"temp": 12}
cur = conn.cursor()
# Note: str() renders the dict with single quotes, which is why we'll need
# a little quote-swapping when we read the payload back out later
cur.execute("INSERT INTO " + sensor_data_table + " (payload) VALUES (?)", (str(test_data),))
conn.commit()
conn.close()

Notice how we commit(), then close() our connection afterwards. SQLite can be left in a locked state if connections aren't closed properly (even when our code fails midway).
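
If we'd rather not rely on our own discipline, Python's standard contextlib can handle both chores for us.  A minimal sketch:

import sqlite3
from contextlib import closing

# closing() guarantees conn.close() even if something raises midway,
# while the inner "with conn:" commits on success and rolls back on failure
with closing(sqlite3.connect("riot.db")) as conn:
    with conn:
        conn.execute("INSERT INTO sensor_data (payload) VALUES (?)", ('{"temp": 12}',))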


If we query the table again using sqlite3, we see our new entry there.  That worked, then.


In Herbal Essence, this is all that the application reading and storing sensor readings would have to do (instead of sending them directly to a message broker which would be the more typical action for a permanently connected device).  Once it has committed its JSON payload to the database, it can go on and do something else.  Like watch TV.  Or make a cup of I-O-Tea.

On the other hand, a separate application would need to retrieve values from this same database, and send them to AWS IoT Core.  That's fine.  We can use SELECT statements to achieve this.  Later, we would need to alter the statement to add a WHERE processed = 0 to make sure we only dispatch readings that haven't yet been uploaded.

import json
import sqlite3

conn = sqlite3.connect("riot.db")
cur = conn.cursor()
sensor_data_table = "sensor_data"
rows = cur.execute("SELECT payload FROM " + sensor_data_table)
payloads = []
for row in rows:
    # str() stored the dictionaries with single quotes, so swap them back before parsing
    payloads.append(json.loads(row[0].replace("'", '"')))
print(payloads)
conn.close()


Great!  We have returned all three rows that we've inserted into the database so far, and have them in a list of dictionary objects.
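
Later, once the processed flag starts doing its job, that query only needs a small tweak.  A sketch of the altered statement (we fetch id too, so that we know which rows to mark afterwards):

import sqlite3

conn = sqlite3.connect("riot.db")
cur = conn.cursor()
# Only rows that haven't been shipped to AWS IoT Core yet
unprocessed = cur.execute("SELECT id, payload FROM sensor_data WHERE processed = 0").fetchall()
conn.close()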

Next, we need a little trick to test if the Pi has Internet connectivity.  What better way to test this, than to check if there is connectivity to the specific AWS IoT HTTP endpoint that we'll be using later to upload our data.

import socket

def test_connection(url, port):
    # Return True if a TCP connection to (url, port) succeeds within 2 seconds
    try:
        host = socket.gethostbyname(url)
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:
        return False

test_url = "oursecret.iot.eu-west-1.amazonaws.com"
test_url_port = 443
test_connection(test_url, test_url_port)

test_url_port = 777 # do we feel lucky, punk?
test_connection(test_url, test_url_port)

When the remote AWS IoT Core HTTP endpoint was reachable, our function returned True.  When we deliberately tried a different port on which the service isn't running, it returned False.


This should provide us with sufficient assurances to confirm that there is network connectivity, and that the application should start uploading cached rows.  Of course, it can still fail later on during individual uploads.  But we can trap those exceptions individually as well (and make sure we don't set the processed flag column).

This is where we sort of have a choice.  We could either:

  • Use MQTT to upload our cached reading to AWS IoT Core, using the paho.mqtt library
  • Use REST API (HTTP) to upload our cached reading to AWS IoT Core, using the Requests library

We are providing an example of each ('cause we're kind like that), but we do end up settling for the latter option for reasons explained later.

We've done MQTT literally a million times before.  So why not do it again.  Using the certificates issued to our thing in the AWS IoT registry, we can use paho.mqtt.publish.single() to send our payload to the topic riot/brick.

import paho.mqtt.publish as mqtt_publish
import paho.mqtt.client as mqtt
import ssl
import json

broker_address = "oursecret.iot.eu-west-1.amazonaws.com"
broker_port = 8883
client_name = "riot-brick"
topic = "riot/brick"
data = {"temperature": 13}
# TLS settings for mutual authentication with AWS IoT Core
tls = {
    "certfile": "oursecret-certificate.pem.crt.txt",
    "keyfile": "oursecret-private.pem.key",
    "ca_certs": "root.ca.pem",
    "tls_version": ssl.PROTOCOL_TLSv1_2,
}
mqtt_publish.single(
    hostname=broker_address,
    port=broker_port,
    client_id=client_name,
    tls=tls,
    topic=topic,
    payload=json.dumps(data),
    protocol=mqtt.MQTTv311
)

Not a lot of fuss.  Nice and simple.  Perhaps too casual for our liking.


Did it get there?  Sure, it did.


But does the use of MQTT give us absolute assurances from the client-side that the data has been received successfully?  With publish.single() defaulting to QoS 0 - fire and forget - we're not so sure.

An alternative would be to use AWS IoT's HTTP(S) endpoint to deliver our payload.  Since this will be delivered through a REST API interface, it should provide us with an immediate response to state if the procedure succeeded.

Let's quickly test this out using curl to see if it works.

curl --tlsv1.2 --cacert root.ca.pem --cert oursecret-certificate.pem.crt.txt --key oursecret-private.pem.key -X POST -d "{ \"temperature\": 50}" "https://oursecret-ats.iot.eu-west-1.amazonaws.com:8443/topics/riot/brick"

There it is.  A rather reassuring instant response confirming the transaction.  This sure feels like a more reliable way of ensuring that our readings have been uploaded.  At which point, we can mark our record back in the SQLite database as processed.

Wow, it's getting hot around here quite quickly.  We better get the air-con on.  Oh, we forgot they don't exist in houses over here in the UK. We better spend the afternoon pretending to do some shopping in the local Sainsbury's frozen food aisle.


From an AWS IoT Core standpoint, MQTT or HTTP, it appears to make no difference.  The message is still received.  And that means it can go on and cause all the mischief it wants in the remainder of the AWS IoT portfolio, and beyond.  AWS IoT Analytics, IoT Events, Lambda, Elasticsearch Service - we're looking at you...


Lastly, we replace curl with the Python Requests module.   We do pretty much the same thing as we did before, but programmatically now from within Python using requests.post().

import requests
import json

aws_iot_endpoint = "https://oursecret.iot.eu-west-1.amazonaws.com:8443/topics/riot/brick"
# Client certificate / private key pair issued to our thing, plus the Amazon root CA
aws_iot_certs = ("oursecret-certificate.pem.crt.txt", "oursecret-private.pem.key")
aws_ca_certfile = "root.ca.pem"
data = {"temperature": 60}
r = requests.post(url=aws_iot_endpoint, cert=aws_iot_certs, verify=aws_ca_certfile, data=json.dumps(data))
print(r.status_code)


HTTP status code 200 is good.  Very good.  Except the temperature.  It's time to relocate to Greenland.  Global warming is really starting to take its toll.

That's it.  We can bombard AWS IoT Core with our REST API POST requests using the magic of HTTP. And be reassured that they worked.


All in all, it doesn't take a stretch of our imagination to combine all this code to make two distinct applications (with only the database in common):

  • An application to take and store sensor readings in the table.  By default, these entries will be marked as processed = 0.  That's it.  The REST will be done by...
  • A separate application to retrieve from the table all sensor readings marked as processed = 0 and upload them to AWS IoT Core using the HTTP endpoint... only when a connection is available.  Records successfully uploaded will be marked as processed = 1, while those that failed should remain unprocessed.  Repeat every few minutes.  There's a sketch of this loop just below.
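
To illustrate, here's roughly what that second application might boil down to, condensed into one minimal sketch (the endpoint, certificate filenames and quote-swap are carried over from earlier; treat it as a sketch, not the finished article):

import sqlite3
import time

import requests

aws_iot_endpoint = "https://oursecret.iot.eu-west-1.amazonaws.com:8443/topics/riot/brick"
aws_iot_certs = ("oursecret-certificate.pem.crt.txt", "oursecret-private.pem.key")
aws_ca_certfile = "root.ca.pem"

while True:
    # (Optionally, gate this whole run on test_connection() from earlier)
    conn = sqlite3.connect("riot.db")
    cur = conn.cursor()
    # Fetch everything that hasn't made it to AWS IoT Core yet
    for row_id, payload in cur.execute(
            "SELECT id, payload FROM sensor_data WHERE processed = 0").fetchall():
        try:
            r = requests.post(url=aws_iot_endpoint, cert=aws_iot_certs, verify=aws_ca_certfile,
                              data=payload.replace("'", '"'), timeout=5)
            if r.status_code == 200:
                # Upload confirmed: flip the flag so we never send this record again
                cur.execute("UPDATE sensor_data SET processed = 1 WHERE id = ?", (row_id,))
                conn.commit()
        except requests.exceptions.RequestException:
            # Connectivity vanished mid-batch: leave processed = 0 and retry next run
            pass
    conn.close()
    time.sleep(60)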

Here's our rudimentary version in action (running at 1 minute intervals).  The exact Python code used forms part of our larger project, and can be found here on GitHub:


When we insert some records into the database between runs, they are eventually picked up during the uploader's next run, and posted to AWS IoT in one big batch.


Of course, there is one VERY important consideration to be made amidst all this.

Records aren't going to be arriving in AWS IoT Core in real-time, and not necessarily in the right order either.  In order to reassemble this data correctly in any time-series datastore, for example for graphical analysis using Kibana or Grafana, we are going to have to include a timestamp in the JSON payload.  This is not the timestamp of when the payload was sent up to the cloud.  Not when it was received by the cloud.  But when it was originally recorded back at source.  This way, the timestamp key can be used later in the datastore (such as the one underpinning Elasticsearch) to organise the records in the correct sequence.
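
In practice, that simply means stamping the reading at the moment it is taken, before it ever touches the database.  A minimal sketch (the key name timestamp and epoch seconds are our choice here):

import json
import sqlite3
import time

# Record when the measurement was taken at source - not when it gets uploaded
reading = {"temperature": 13, "timestamp": int(time.time())}

conn = sqlite3.connect("riot.db")
with conn:
    conn.execute("INSERT INTO sensor_data (payload) VALUES (?)", (json.dumps(reading),))
conn.close()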

This data from an afternoon run wasn't uploaded until we were back at home.  But thanks to the use of a timestamp attribute, the events can be reconstructed retrospectively.


It also means that this technique isn't always the right approach, especially if there are near real-time responses required.  For example, we wouldn't want a sprinkler system not to come on immediately because data from a fire detector has been cached somewhere.  Nor would we want it to suddenly spray the building full of water because it received the delayed payload several hours later (and after the system had in fact already been reset at the time, because it was a false alarm).

Throughout, we would also want to monitor for the unusual absence of events, for which we could use a technique involving AWS IoT Events that we trialled in Pear Force One.  Clearly, we are expecting the device to remain offline for a while... but how long before we start to become suspicious?

Oh, it's done?  Already?

Our face is still numb from the anaesthetic, and we can't speak properly.  But at least we can now make our escape from the dentist's chair and get ourselves a nice coffee.  Interesting.  A fresh coffee from Caffè Nero is normally a lot hotter when it is first served... Oh well.

Gulp. Gulp.

Floss the rest:

SQLite documentation can be found here.  You should probably read it.
The official Python sqlite3 library is pretty essential if you want to do anything with Python and SQLite.
AWS IoT's HTTP REST interface is described here.  It's worth a read if MQTT has been your thing so far.
We are using the ever-formidable Python Requests library to interact with the AWS IoT HTTP endpoint.
