Egg detection SVM update

Following on from the last post, with the basic machine learning system in place I thought I’d try to add some more features to the system to improve the result.

The most obvious element that was not being utilised in the system was that the egg colour and texture are reasonably consistent from egg to egg and the immediate background surrounding the egg is generally dark.

I crop out the potential egg and its surrounds from the larger image, split it into 10 regions (9 plus the ellipse) and take the median intensity of each segment.  That adds 10 features to the data.



Illustration of the segments of the cropped image that make up the features.
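A minimal sketch of how the ten intensity features might be computed, assuming the nine regions form a 3×3 grid over the crop and the tenth is the interior of the fitted ellipse.  The names region_medians and axis_aligned_ellipse_mask are made up for illustration, and the mask ignores rotation to keep things short; the real layout of the segments may differ.

```python
import numpy as np


def region_medians(crop, ellipse_mask):
    """Return 10 median intensities: a 3x3 grid over the crop plus the ellipse.

    crop         -- 2D greyscale image (numpy array)
    ellipse_mask -- boolean array, same shape, True inside the ellipse
    The 3x3 layout is an assumption made for illustration.
    """
    rows = np.array_split(np.arange(crop.shape[0]), 3)
    cols = np.array_split(np.arange(crop.shape[1]), 3)
    features = []
    for r in rows:
        for c in cols:
            cell = crop[r[0]:r[-1] + 1, c[0]:c[-1] + 1]
            features.append(float(np.median(cell)))
    # Tenth feature: median intensity of the pixels inside the ellipse.
    features.append(float(np.median(crop[ellipse_mask])))
    return features


def axis_aligned_ellipse_mask(shape, centre, axes):
    """Boolean mask of an un-rotated ellipse (rotation ignored for brevity)."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    cx, cy = centre
    a, b = axes  # semi-axes in x and y
    return ((xx - cx) / a) ** 2 + ((yy - cy) / b) ** 2 <= 1.0


# Synthetic example: a bright "egg" on a dark background.
crop = np.full((30, 30), 20, dtype=np.uint8)
mask = axis_aligned_ellipse_mask(crop.shape, centre=(15, 15), axes=(4, 3))
crop[mask] = 200
features = region_medians(crop, mask)
```

For a real egg the ellipse feature should come out bright and the surrounding cells dark, which is the pattern the SVM can pick up on.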

After extracting the features from the training data I reran the test set.

Train/Test data  10302 2576
True positive  0.993808049536
False positive  0.00621761658031

Now that is an improvement!  Given there was some data I wasn’t sure whether to consider true or false when classifying, that is a very good result.


The same test image as last post, no false detections.

It demonstrates that once you have the classified data and the system set up, you can easily add features to improve the system.  It sure beats adding more if-then-else…



Egg Detection using an SVM

With the lizard hatching season underway I’m starting to get some meaningful data from my Incubator Monitor I blogged about earlier in the year.


Strophurus ciliaris geckos hatching hours apart.

To extend the project and to try some machine learning techniques in the “real world” I decided to add a feature where it could keep track of which eggs have hatched and detect a second egg hatching.  Typically the second egg of a clutch hatches hours or up to a few days later than the first.

To that end, I decided to write a python module to detect the eggs in the image.  They are light coloured ellipses on a dark background so it shouldn’t be too hard.

Features of the eggs in the images:

  • Light eggs on dark background.
  • Elliptical shape
  • Vary in size based on species, time since laying and fish-eye distortion of the camera.
  • Variable orientation (vertical, horizontal, anything in between)

Let’s follow a typical object detection process:

  1. Preprocessing
  2. Feature Extraction
  3. Train a learning algorithm
  4. Label things!

Step 1: Preprocessing

Not much.  The camera is now mounted to the inside of the incubator lid with constant IR LED lighting so I just read the images as grey scale and leave it at that.  I could add compensation for the fish-eye and better lighting but this is OK for now.

Step 2: Feature Extraction

To start with I’m using a basic edge-detection to detect features in the image.  More modern algorithms exist but this is an easy to understand place to start.  These edges really give me two important things:

  1. A way to iterate through a list of potential features.  Given the eggs vary in number, location, size and orientation, the edges tell me where to look in the image for interesting things.
  2. Features for the learning algorithm.  The edges can form a basis to extract information about any potential eggs (see below).

I decided to use edge detection and then match the resultant contours (contiguous edges) to an egg shape.  I use the cv2 function fitEllipse to fit an ellipse to each contour.  If the contour looks like an egg, the fitted ellipse will match it.

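edges = cv2.Canny(self.img, self.edge_low_thresh, self.edge_high_thresh, 5)
# Three return values is the OpenCV 3.x form; OpenCV 4.x returns (contours, hierarchy).
im, contours, heir = cv2.findContours(edges,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)

# fitEllipse needs a contour with at least 5 points.
fit_ellipse_i = cv2.fitEllipse(contours[i])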


I then implement a coarse filter on the resultant ellipses.  Only those that meet the basic criteria are kept:

  • The length and width are in a defined range.
  • The contour and ellipse match well.

This gives me a list of “possible” eggs.

I should note here that I originally thought that this might be enough and I could just tune the various parameters to get an accurate list here, but the two categories (egg, not_an_egg) were not easily separated using a series of if conditions.

This list of possible eggs provides me with some initial “features” for the machine learning.  The algorithm needs to be fed with a list of features that describe the egg detections in such a way that the algorithm can distinguish real eggs from false detections.

What makes an egg an egg?

  1. Egg shaped
  2. Within the size range of an egg


feature1 : The error between the detected contour and the fitted egg-shaped ellipse.

feature2 : The ratio between width and length

feature3 : egg width

feature4 : egg length
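As a rough sketch of how these four features could be built from a contour and its fitted ellipse: the error measure below (mean distance from the unit ellipse after normalising each point) is my assumption for illustration, not necessarily the one used in the project, and egg_features is a made-up name.  The ellipse tuple follows the layout returned by cv2.fitEllipse and the output order follows the feature vector used in the training code.

```python
import math


def egg_features(contour_points, ellipse):
    """Build [error, width, length, length/width] for one candidate.

    contour_points -- iterable of (x, y) points on the detected contour
    ellipse        -- ((cx, cy), (width, length), angle_degrees), the layout
                      returned by cv2.fitEllipse
    The error measure is an assumption for illustration: the mean of
    |r - 1|, where r is 1 for a point lying exactly on the ellipse.
    """
    (cx, cy), (width, length), angle = ellipse
    a, b = width / 2.0, length / 2.0
    cos_t = math.cos(math.radians(angle))
    sin_t = math.sin(math.radians(angle))

    total = 0.0
    count = 0
    for x, y in contour_points:
        # Move the point into the ellipse's own frame (centre at origin, axes aligned).
        dx, dy = x - cx, y - cy
        u = dx * cos_t + dy * sin_t
        v = -dx * sin_t + dy * cos_t
        r = math.hypot(u / a, v / b)
        total += abs(r - 1.0)
        count += 1

    error = total / count
    return [error, width, length, length / width]
```

A contour that hugs the ellipse gives an error near zero, while a ragged blob that merely happens to be the right size gives a larger one.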

Step 3: Train a learning algorithm

I have a list of potential eggs and some features describing them.  Now comes the important and tedious part : training the machine learning algorithm with training data.  This involved recording a bunch of images with different egg configurations and then classifying each detection manually as an egg or not_an_egg.  I took a number of images with each configuration as each gives slightly different data.  I then fed them through a program that displayed each candidate and let me enter y/n to classify it.  For a sequence of images the program remembered where the eggs and not_an_eggs were, to reduce the data entry (laziness is the mother of invention).

From this I found 12878 potential detections, 3318 eggs and 9560 not_an_eggs to use to train and test the algorithm.

For the Support Vector Machine learning algorithm I used the svm implementation provided by the sklearn python module, with its default Radial Basis Function (RBF) kernel.

sklearn SVM example:

I create a feature vector with 80% of the data to train the SVM:

 # extract 80% as training and 20% as test data
 self.train = [raw[i] for i in range(len(raw)) if i%5]
 self.test = [raw[i] for i in range(len(raw)) if i%5==0]
 print("Train/Test data ", len(self.train), len(self.test))

 #extract feature vectors from the dictionary of classified data
 self.X_train = [[x["error"], x["ellipse"][1][0], x["ellipse"][1][1],
        x["ellipse"][1][1]/x["ellipse"][1][0]] for x in self.train]
 #extract classifications
 Y = [x['classification'] == 'EGG' for x in self.train]
 self.Y_train = np.array(Y).astype(int)

 self.X_test = [[x["error"], x["ellipse"][1][0], x["ellipse"][1][1], 
        x["ellipse"][1][1]/x["ellipse"][1][0]] for x in self.test]
 Y = [x['classification'] == 'EGG' for x in self.test]
 self.Y_test = np.array(Y).astype(int)

 # scale the data to zero mean and unit standard deviation.
 self.scaler = preprocessing.StandardScaler().fit(self.X_train)
 X_scaled = self.scaler.transform(self.X_train)

 #train an svm
 self.clf = svm.SVC().fit(X_scaled, self.Y_train)

Then running a test with the held back 20%:

X_test = self.scaler.transform(self.X_test)
self.pred = self.clf.predict(X_test)

self.True_test = [i for i in range(len(self.Y_test)) 
                  if self.Y_test[i] > 0]
self.False_test = [i for i in range(len(self.Y_test))
                   if self.Y_test[i] == 0]

print("True positive ", np.mean(self.pred[self.True_test]))
print("False positive ", np.mean(self.pred[self.False_test]))

To get this result:

Train/Test data  10302 2576
True positive  0.927244582043
False positive  0.0321243523316
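To make the numbers above a little more concrete, here is a small stand-alone sketch of the same every-fifth-item split and of the two rates being printed.  The helper names and the toy labels are mine; only the 12878 total comes from the data above.

```python
def split_every_fifth(items):
    """Hold back every fifth item (index 0, 5, 10, ...) as test data."""
    train = [item for i, item in enumerate(items) if i % 5]
    test = [item for i, item in enumerate(items) if i % 5 == 0]
    return train, test


def positive_rates(actual, predicted):
    """Return (true positive rate, false positive rate) for 0/1 labels."""
    on_eggs = [p for a, p in zip(actual, predicted) if a == 1]
    on_non_eggs = [p for a, p in zip(actual, predicted) if a == 0]
    return sum(on_eggs) / len(on_eggs), sum(on_non_eggs) / len(on_non_eggs)


train, test = split_every_fifth(list(range(12878)))
print(len(train), len(test))  # 10302 2576

# Toy labels: 4 real eggs (3 found) and 4 non-eggs (1 wrongly flagged).
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
tpr, fpr = positive_rates(actual, predicted)
print(tpr, fpr)  # 0.75 0.25
```

So "True positive" is the fraction of real eggs that were predicted as eggs, and "False positive" is the fraction of not_an_eggs that were wrongly predicted as eggs.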

This is better than I was getting with a sequence of if conditions and not too bad considering the simplicity of the features extracted, the noisy background and that I have done no tuning of the SVM (C or gamma).

Step 4: Label things!

Here is an interesting example. Note green ellipses are predicted eggs and red ellipses are predicted not_an_egg.

There is one false detection where the bottom of the container and the vermiculite seen through the side of the container create a reasonable sized egg that matches the ellipse pretty well.

Some eggs are missed due to the edge detection failing to separate them from the crud/container next to them – it needs enough separation from the background.  I can use segments of contours but that creates more false positives at the moment.


More potential eggs means more false positives.

It is a good first step with room for a lot of improvement. Stay tuned!

Where to find it

Check out the code here on GitHub: IncubatorMonitor




UV Meter

UV is important to reptiles (and people) in processing vitamin D3, which is needed to absorb calcium. Without it reptiles often suffer from Metabolic Bone Disease (MBD), which leads to weak bones and can be fatal.  Thankfully, pet shops have the answer: UV-producing globes which can stave off the dreaded MBD.  These come in the form of T8 fluorescent tubes or compact fluorescent globes and go for $50+ each.  And, yeah, they need to be replaced periodically.  How do I know if I am replacing a globe while it is still producing enough UV?  How do I know I haven’t waited too long?  The standard wisdom is to replace them every 6 to 12 months, which can add up across multiple tanks, not to mention the environmental impact of whatever goes into making them.

Or, I can measure it.  UV meters sell for $400 but I thought I might be able to do something myself.

I had an Arduino Uno R3 and a TFT touchscreen lying around so I picked up a UV sensor and gave it a go.


The sensor

There are quite a few UV sensors around, most with a narrow frequency response.  While the wavelength of vitamin D3 absorption is around 300nm (UVB), UVA is also important for reptiles.  For this reason, and to avoid missing the relatively narrow peaks in the output of commercial UV lamps, I decided to go with the ML8511, which seems to have the broadest wavelength band of the UV sensors I could find.  Reminder: Always read the datasheet, ideally before you buy the part, or you could be disappointed.


ML8511 frequency response


Example UV lamp output (Exo Terra : Exo Terra CF lamp)

It produces an analog voltage relative to the UV intensity detected.  It is then easy enough to map the voltage to an intensity after sampling the analog voltage a few times to smooth it.  Some code was reused from a DFrobot example.
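The real code runs on the Arduino in C++, but the smoothing and mapping step can be sketched in a few lines of python.  The calibration points below (0.99V at zero, 2.8V at 15 mW/cm^2) are assumptions for illustration and should be taken from the datasheet or the example code you are working from; the function names are mine as well.

```python
def average(samples):
    """Smooth a burst of ADC readings by taking their mean."""
    return sum(samples) / len(samples)


def adc_to_volts(reading, vref=5.0, max_count=1023):
    """Convert a 10-bit ADC count (as on an Arduino Uno) to volts."""
    return reading * vref / max_count


def map_float(x, in_min, in_max, out_min, out_max):
    """Linear interpolation, the float equivalent of Arduino's map()."""
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min


# Assumed calibration points: replace with values from the datasheet.
V_AT_ZERO = 0.99      # volts at 0 mW/cm^2 (assumption)
V_AT_FULL = 2.8       # volts at full scale (assumption)
FULL_SCALE = 15.0     # mW/cm^2 at V_AT_FULL (assumption)


def uv_intensity(adc_samples):
    """Average a burst of ADC counts and map the voltage to mW/cm^2."""
    volts = adc_to_volts(average(adc_samples))
    return map_float(volts, V_AT_ZERO, V_AT_FULL, 0.0, FULL_SCALE)
```

Averaging before mapping keeps a single noisy sample from showing up as a jump on the display.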

A quick walk outside and I could see the intensity go from 0 to 3 mW/cm^2, which seemed reasonable for a mostly sunny winter’s morning in Melbourne.

Breakout for ML8511 from my usual go-to : ML8511 from core electronics


I used an Arduino UNO R3 clone and wired the sensor:

  • Power : 5V
  • Ground : Ground
  • Analog Output : A4 input.  (to avoid clashing with A0 to A3 which can be used by some of the display/keypad shields)

I soldered it (well, actually my friend did, who has much steadier hands) straight to the Arduino board to avoid the extra thickness of a prototype board.



Adafruit make these cool TFT resistive touchscreen shields for Arduino and I couldn’t resist buying one.  The screen update is pretty slow (visible flickering) when used over the SPI bus but it works really nicely in this project when I only redraw elements when they change.

The libraries provided for the screen and touch sensor work like a charm so integration was pretty straightforward.  They provide basic drawing functions like line, circle, rectangle and text but you need to build text box and button classes yourself if you want to simplify building a UI.

Adafruit TFT resistive touch screen available here.




The intensity varies over distance (by the inverse square law) so an intensity without a distance is not meaningful when judging the output of the globe.  Instead of being forced to measure a distance each time I decided to measure the output right at the globe, which involves sticking the device up and into each cage.  To avoid having to read off the screen while performing a yoga move I made the main display the maximum value registered, so I can put the device in the cage up to the globe and then pull it out to get a measurement.  I included a smaller instantaneous output and voltage output as well.  To reset the captured measurement I hit the reset button on the screen.

The Max and Instantaneous values are displayed in a colour indicating if they are in the range expected of a “good” globe.  Green for good, yellow for reduced and red for “it’s dead, Jim”.

The tubes provide light across the length of the tube while the CFs provide a more concentrated source so the expected output for each is different at the globe.  To account for this the user can select CF or TUBE to change the assessment thresholds.

Finally, the zero mW/cm2 point seems to fluctuate a little (~0.02V) so I also include a calibration button which sets the current instantaneous measurement to “0.0 mW/cm2”.
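The behaviour described above (peak hold, reset, zero calibration and a colour per lamp type) boils down to a small amount of state.  Here is a python sketch of that logic; the threshold values are invented placeholders, the class name is mine, and I apply the zero offset to the intensity rather than the voltage to keep it short.

```python
# Hypothetical thresholds in mW/cm^2; the real values are not given in the post.
THRESHOLDS = {
    "CF": {"good": 2.0, "reduced": 1.0},
    "TUBE": {"good": 1.0, "reduced": 0.5},
}


class UVReadout:
    """Tracks the instantaneous and peak reading with a zero offset."""

    def __init__(self, lamp_type="CF"):
        self.lamp_type = lamp_type
        self.zero_offset = 0.0
        self.instant = 0.0
        self.peak = 0.0

    def update(self, raw_intensity):
        """Feed in a new reading; the peak is held until reset()."""
        self.instant = raw_intensity - self.zero_offset
        self.peak = max(self.peak, self.instant)

    def reset(self):
        """Reset button: clear the held maximum."""
        self.peak = 0.0

    def calibrate(self, raw_intensity):
        """Calibration button: treat the current raw reading as 0.0 mW/cm^2."""
        self.zero_offset = raw_intensity

    def colour(self, value):
        """Green for good, yellow for reduced, red otherwise."""
        limits = THRESHOLDS[self.lamp_type]
        if value >= limits["good"]:
            return "green"
        if value >= limits["reduced"]:
            return "yellow"
        return "red"
```

Switching between CF and TUBE only changes which row of thresholds is used, so the rest of the display logic stays the same.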



The Code

It is interesting writing Arduino code as it is C++ without the standard template library (for memory reasons), so we have classes but no built-in container classes, for example.

It is a pretty simple application so I didn’t get too carried away.  I created a textbox class with just the visual customisation I needed (background colour, text colour & size, textbox size, textbox position).  I then extended that into a button class with a simple isIn function for when the screen is touched and a selected flag to give it two states.

Check out the code here: UVMeter on GitHub

I update the display at 2Hz but only redraw objects that have changed, which avoids the flickering you get if you redraw every object each time.  Note, if a text value (i.e. a numerical display) is changing you need to redraw the background before rewriting the text, as the new text is simply drawn over the top of the old, leading to a mess of drawn pixels.  Hence a lot of text boxes with a black background.
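The actual classes are C++ on the Arduino, but the two ideas (redraw only when the text changes, and a simple hit test for buttons) are easy to show in python.  Everything here is a sketch: the method names are mine and FakeScreen is a stand-in that just records calls rather than the Adafruit library.

```python
class TextBox:
    """Rectangle with text that only redraws when its content changes."""

    def __init__(self, x, y, w, h, text=""):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.text = text
        self.dirty = True   # draw once at start-up

    def set_text(self, text):
        if text != self.text:
            self.text = text
            self.dirty = True

    def draw(self, screen):
        """Redraw only if needed; returns True when something was drawn."""
        if not self.dirty:
            return False
        # Fill the background first so old glyphs do not show through.
        screen.fill_rect(self.x, self.y, self.w, self.h)
        screen.draw_text(self.x, self.y, self.text)
        self.dirty = False
        return True


class Button(TextBox):
    """TextBox with a hit test and a two-state selected flag."""

    def __init__(self, x, y, w, h, text=""):
        super().__init__(x, y, w, h, text)
        self.selected = False

    def is_in(self, px, py):
        """True if the touch point (px, py) falls inside the button."""
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


class FakeScreen:
    """Stand-in display that records drawing calls."""

    def __init__(self):
        self.calls = []

    def fill_rect(self, x, y, w, h):
        self.calls.append(("fill_rect", x, y, w, h))

    def draw_text(self, x, y, text):
        self.calls.append(("draw_text", x, y, text))
```

Calling draw on every object at 2Hz is then cheap, because most calls return straight away without touching the screen.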

What is left

The device is not in a case yet and I currently power the Arduino with a mobile phone power pack.  I hope to build a case with a battery pack with 3D printing but it could take a while to get to the top of the list so if anyone is keen to do it let me know and I’m happy to be “first to be second” as they say.


This turned out to be pretty straightforward and useful.  Most of the time on the project will be in building a case for it!

Computer Vision – Egg hatching detection

The release of Raspberry PI 3 along with simple, low cost cameras makes entry in to camera motion detection simple and cheap.  For me this meant a handy (and cool) project to monitor my lizard egg incubator.

The Project

I have lizard eggs hatching a couple of times a week.  It is good to know when they hatch as I like to leave them in the incubator and then take them out after 24hrs.  Normally I check each day or two and then think, “did that just hatch or did it hatch just after I last checked?”.

I also wanted to trial the OpenCV image processing library as well as brush up on my Python.



I could try to train an algorithm to detect the various species of lizard in the image.  It could be done, and would be interesting, but a more straightforward approach is via motion detection.  The hatching process can take hours from start to finish, so from image to image at 24fps the changes will be very small.  So how could it work?  Easy: take an image every minute and compare it to a reference image, then use the difference between the images to judge what has happened.

Check out a timelapse video I created based on the images captured of a gecko hatching from its egg : Timelapse video

Part 1. Hardware

I used a Raspberry PI 3 to capture the images and do the processing.  It is perfect for the job : cheap, powerful enough for some processing, and has the libraries and ability to run python to make it all pretty easy.

For the camera, I use a wide-angle IR camera with IR LEDs as it needs to run day and night and ideally without visible light to disturb the day/night cycle of the lizards.  Make sure you get a wide-angle camera as the camera to object distance is small (~15cm).

Part 2. Image Capture

I won’t go into detail about how to enable the Raspberry PI Camera as there are plenty of guides out there.  This is how I used it.  The code can capture live images or run through a list of recorded files.  I record each image to allow for testing and training of the code.

Full code is available on GitHub : IncubatorMonitor code

 import os
 import time
 from picamera import PiCamera

 def go_replay(self):

    for f in self.files:
       try:
          (path, ext) = os.path.splitext(f)
          if (os.path.isfile(f) and ext == ".jpg"):
             img = self.process_image(f)
          else:
             print("reject", f, ext)
       # catch exceptions that are likely specific to 1 file.
       except (AttributeError, ValueError, TypeError, IndexError) as e:
          print("Caught", e, "while processing", f)


 def go_live(self):

    camera = PiCamera()
    high_res = (640, 480)
    camera.resolution = high_res

    while True:
       for i in range(1,10):
          filename = self.output_path + 'Incubator_'+ time.strftime('%m%d_%H%M%S')+ '.jpg'
          # the capture call is missing from this excerpt; camera.capture() is assumed
          camera.capture(filename)
          img = self.process_image(filename)

          print("Processed ", filename)

          #todo configure interval
          #split sleep to allow quicker interruption
          # (the sleep calls are missing from this excerpt; one image per minute is assumed)
          for _ in range(60):
             time.sleep(1)

       #write results periodically as it will probably end in tears (i.e. ctrl-C)
       # (the call that writes the results is not shown in this excerpt)
       print('Updated results')

Part 3. Basic Motion detection

This is where you go, “Oh, is that it?”.  It is much easier to write motion detection code in python.  The trick is detecting something useful and avoiding false-positives.

I use the Open Source computer vision library  OpenCV.  I’m barely scratching the surface of what can be done with it but if nothing else, it is a handy way to deal with images.

The hardest part is building OpenCV on the Raspberry Pi.  Here are instructions on how to do it.

Basic steps involved:

  1. Convert to grayscale.  Easier to deal with when only looking at a single intensity value per pixel.  Given the IR camera provides little real color detail we aren’t losing anything useful.
  2. Image difference.  Subtract one image from the other to detect differences.  The camera and the background are fixed so any change _should_ be interesting to us.
  3. Threshold the difference. There is some noise so the difference is never 0 for each pixel.  We use a threshold to detect changes of a meaningful level.  For this application a threshold of 15 to 20 is used.
  4. Dilate the thresholded data.  To smooth and merge (gross simplification) areas above the threshold perform a dilation step.  The size and content of the dilation matrix has an impact on how well this works.  I found a 9×9 matrix with MORPH_ELLIPSE to work pretty well but it could be improved on.
  5. Create polygons with each area above the threshold producing a list of objects to analyse.

Here is a snippet of the relevant code.  Again, full code is available here : IncubatorMonitor code

 import cv2
 import imutils

 # 9x9 elliptical structuring element used for the dilation step
 # (assumed to be a class attribute, hence self.se below)
 se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))

 def __init__(self, blah):
    r = cv2.imread(self.path)
    self.raw = imutils.resize(r, width=500)
    blur = cv2.GaussianBlur(self.raw, (9,9), 0)
    self.processed = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)

 def compareToReference(self, ref_obj, threshold):

    assert(len(self.processed) == len(ref_obj.processed))

    #compare to reference image (attribute name assumed; it is missing from this excerpt)
    self.delta = cv2.absdiff(self.processed, ref_obj.processed)
    # Check against threshold to find meaningful changes
    self.thresh = cv2.threshold(self.delta, threshold, 255, cv2.THRESH_BINARY)[1]

    # Run dilation to smooth and join nearby detections.
    self.dilate = cv2.dilate(self.thresh, self.se, iterations=4)

    #extract a list of separate detections (three return values is the OpenCV 3.x form;
    # the approximation flag is cut off in this excerpt, CHAIN_APPROX_SIMPLE is assumed).
    (_, self.contours, _) = cv2.findContours(self.dilate.copy(), cv2.RETR_EXTERNAL,
                                             cv2.CHAIN_APPROX_SIMPLE)
    self.output = self.raw

    # Create a box for each found movement.
    for c in self.contours:
       (x, y, w, h) = cv2.boundingRect(c)

       _ = cv2.rectangle(self.output, (x,y), (x+w, y+h), (0,200,0), 2)

    self.ref = ref_obj

From this we end up with a set of images showing the difference between the reference image and the image being analysed.

The Image:


  1. The raw difference between the image and reference.


2. The difference threshold


Dilation expands and merges the smaller detections.


Look, we found it!



Part 4. Image Classification

In a perfect world, we can just say each of these detections above the threshold are hatching events.  In practice, there are other things that can happen to trigger a difference between the images: removing the lid which has the camera on it, the light in the room going on and shining through the incubator window, moving the camera, moving the tubs inside, schrodinger’s lizard etc.

In this case the easiest thing to do for some of these is to control the environment:

  • Fix the camera to the inside of the incubator.
  • Cover the viewing ports on the incubator to avoid stray light.

To make things more interesting, I decided not to do that and instead to make the software “smart” enough to tell the difference.

It needs to classify each image into 1 of 4 categories:

  • No detection
  • Found hatching
  • Detected temporary disruption
  • Environment changed, reset reference image.

This is a typical “classification” problem in machine learning terminology.

Start with something simple, some basic logic to separate these cases.  What features can we extract to judge between these categories?

  • No detection
    • Very little difference to the reference image
  • Found hatching
    • There is a noticeable change in 1 specific location.
  • Detected temporary disruption
    • There is a broad change between the image and the reference or a large number of localised differences.
  • Environment changed, reset reference image.
    • The images are very different.
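As a sketch of what that basic logic could look like, here is a rule-based classifier working from the bounding boxes of the detections.  The function name, the category strings and all three thresholds are illustrative guesses rather than the tuned values from the project.

```python
NO_DETECTION = "no detection"
HATCHING = "found hatching"
DISRUPTION = "temporary disruption"
RESET_REFERENCE = "environment changed"


def classify(box_areas, image_area,
             max_boxes=3, disruption_fraction=0.2, reset_fraction=0.6):
    """Classify one image from the bounding boxes of its detections.

    box_areas  -- list of areas (in pixels) of each detected region
    image_area -- total image area in pixels
    All three thresholds are illustrative guesses, not tuned values.
    """
    if not box_areas:
        return NO_DETECTION
    changed = sum(box_areas) / image_area
    if changed >= reset_fraction:
        return RESET_REFERENCE
    if changed >= disruption_fraction or len(box_areas) > max_boxes:
        return DISRUPTION
    return HATCHING
```

The order of the checks matters: the biggest changes are ruled out first, so only a small, localised change falls through to a hatching.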

So, how well does it work?

In short, pretty well.  It never misses a hatching and always detects a big reference change.  The problem, as is to be expected, is false positives of hatching detection that should be classified as a temporary disruption.  This happens once or twice a week.

The solutions:

  1. Keep manually “tuning” (tinkering) the parameters to tune them out.  Probably would work but doesn’t address the underlying fragility.
  2. Put in some memory.  Temporary changes are temporary and hatching is forever so if we have an awareness of what has happened in the past then we can separate these 2 situations.  I’ve done this and it works but it offends me slightly.
  3. Use machine learning techniques to make a smarter assessment.  This sounds like more fun so it is what I will do next.  Logistic Regression or SVM? Neural Network?  We’ll see…
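For the second option, one simple way to add memory is to confirm a hatching only once the change has been seen in several consecutive images.  This is a sketch of that idea rather than the code I actually used; PersistenceFilter and its required count are made-up names and values.

```python
from collections import deque


class PersistenceFilter:
    """Confirm a hatching only after it persists for several images in a row."""

    def __init__(self, required=5):
        self.required = required
        self.history = deque(maxlen=required)

    def update(self, detected):
        """Record one image's result; return True once the change has persisted."""
        self.history.append(bool(detected))
        return len(self.history) == self.required and all(self.history)
```

With one image a minute, required=5 means a disturbance has to last five minutes before it is reported, which filters out someone opening the lid but delays the notification by the same amount.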

Part 5. Notification

For fun, I decided to tweet the notifications to my twitter account.  This is really straight-forward in python.  It can either send me a DM or post an update.

import tweepy
import json

class IncNotify():

   def __init__(self):
      # twitter_auth.json File Format
      # {
      #   "consumer_key" : "blah",
      #   "consumer_secret" : "blah",
      #   "access_token" : "blah",
      #   "access_secret" : "blah"
      # }

      with open('twitter_auth.json') as f:
         twitter_auth = json.loads(f.read())
      # Configure auth for twitter
      auth = tweepy.OAuthHandler(twitter_auth['consumer_key'], twitter_auth['consumer_secret'])
      auth.set_access_token(twitter_auth['access_token'], twitter_auth['access_secret'])

      self.api = tweepy.API(auth)

      ## TODO handle failure
      print ("Twitter output enabled")

   def notify(self, img, level):

      # TODO img is correct format.
      if self.api is None or img is None or level == 0:
         return None

      msg = "The incubator monitor has detected a gecko hatching. Did I get it right?"

      if level <= 1:
         # low level: send a direct message only
         self.api.send_direct_message(user="namezmud", text=msg)
      else:
         # otherwise post an update with the image attached
         # (the else branch is assumed; it is missing from this excerpt)
         path = img.output_path
         if not path:
            path = img.path
         self.api.update_with_media(path, msg)

      print("SEND!!!! " + img.getShortname())

Conclusion (so far)

There are still plenty of TODOs in the code and extensions that could be made but it works for now and I’ll set it up live for next breeding season.  Plenty to work on in what spare time I have.

If you are interested in giving it a go, feel free.  All the code is on github here for public consumption. IncubatorMonitor

The aftermath

Within the space of a week I had my PC stolen in a burglary and snapped in half the micro SD card in the raspberry PI.  These were where the images and backup for most of the hatching season were stored so I lost all my unit tests and most of my training data.  We’ll have to see if I have enough data left to train a machine learning model or if I have to wait until the end of the year to collect more data.  If there is anyone in the northern hemisphere breeding this summer who is interested in setting up a camera, please get in touch!






Yet Another Monitor

It is probably the most obvious and straight-forward IoT project there is but for me it was a chance to try some technology I haven’t had a chance to use – the ‘ol ‘monitor the temperature and display some pretty graphs’ project.  Much of what I have used is overkill for the need but provides a good base of knowledge for something bigger.

The Project

Monitor the temperature of up to 20 reptile enclosures, logging the data to ensure an optimal environment, particularly during the winter cooling period and correlate successful breeding with winter temperatures.

The Solution

Using every buzz-word in one project… IoT, “the cloud”, AWS, Python.


I’ll go into more detail on each part in future posts.  To set the scene, this is how it went.

Part 1. Sensors

I need around 20 sensors spread over two sides of a room so the DS18B20 sensors in a waterproof housing are a good option.  They are cheap (on ebay) and communicate via a digital signal over a 3 wire bus (incl. power & ground), so there is less wiring and none of the noise/voltage drop issues of an analog sensor.

Part 2. Arduino / Particle Photon

Two of a number of options for wifi connected micro-controllers.  Read the sensors from the bus on a digital input and broadcast through MQTT. The use of MQTT means any number of devices can be connected to any number (within reason) of sensors without the upstream processing needing to know or care.  JSON allows data to be encoded in a format that is human readable and easy to work with in python.  Sure the payload is larger than a binary message, but in this case, who cares?
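To give a feel for the JSON side, here is a small sketch of encoding one reading into a topic and payload and decoding it again in python.  The topic layout, the field names and the sensor id are all made up for illustration; the payload string is what would be handed to whatever MQTT client library you use.

```python
import json


def encode_reading(sensor_id, temperature_c, timestamp):
    """Return (topic, payload) for one temperature reading."""
    topic = "reptiles/temperature/" + sensor_id
    payload = json.dumps(
        {"sensor": sensor_id, "temp_c": temperature_c, "ts": timestamp},
        sort_keys=True,
    )
    return topic, payload


def decode_reading(payload):
    """Turn a received payload back into a dictionary."""
    return json.loads(payload)


topic, payload = encode_reading("28-0000075a1b2c", 24.5, 1500000000)
print(topic)    # reptiles/temperature/28-0000075a1b2c
print(payload)  # {"sensor": "28-0000075a1b2c", "temp_c": 24.5, "ts": 1500000000}
```

Putting the sensor id in both the topic and the payload is redundant, but it means a subscriber can filter by topic and still store the message as-is.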

Sneak-peek : The Photon is much better than the arduino in this context.  The new raspberry pi zero W would also do the job with a 5 line shell script at AUD$15.

Side-note: Am I old for realising the Pi Zero W is more powerful than the first $10k I spent on computers?

Part 3. MQTT Broker

Handy gateway to “the cloud”, down-sampling data, adding sensor -> location mapping plus security and buffering to the MQTT connection to the cloud.
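The down-sampling and sensor -> location mapping can be sketched as below.  This is only an outline of the idea under my own assumptions: the mapping table, the five minute window and the field names are hypothetical, and the security and buffering parts are not shown.

```python
from collections import defaultdict

# Hypothetical sensor-id -> enclosure mapping.
LOCATIONS = {
    "28-0000075a1b2c": "ciliaris-tank-1",
    "28-0000075a1b2d": "incubator",
}


def downsample(readings, window_s=300):
    """Average readings into fixed time windows per location.

    readings -- iterable of dicts with "sensor", "temp_c" and "ts" (seconds)
    Returns a list of dicts with "location", "ts" (window start) and "temp_c".
    Readings from unmapped sensors are dropped.
    """
    buckets = defaultdict(list)
    for r in readings:
        location = LOCATIONS.get(r["sensor"])
        if location is None:
            continue
        window_start = r["ts"] - r["ts"] % window_s
        buckets[(location, window_start)].append(r["temp_c"])

    return [
        {"location": loc, "ts": start, "temp_c": sum(vals) / len(vals)}
        for (loc, start), vals in sorted(buckets.items())
    ]
```

Doing this at the broker means the devices can stay dumb and publish as often as they like, while only the averaged values go upstream.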

Part 4. Storage

A database as you expect.  NoSQL and in the cloud : because I can…  And so I can access the data outside my network without opening up my ports.

Part 5. Display

On the web, using python and one of the many charting libraries, hosted in the cloud.  If 100,000 people want to view the temperature I keep my S. ciliaris at they can, if I pay for the capacity…. And again, I don’t need to open my ports.

This was an opportunity to try flask and bokeh python libraries for web-framework and charting respectively.  Django and google charts would be fine too.

I haven’t put much effort in to the display so far so it is pretty basic and clunky but works as a proof of concept.

Check it out for yourself live : YarraRiverReptiles Live!




This is still a work in progress, things to clean up and features to add but it is a start.  If you are interested in more details, check out the follow up posts over the next few weeks or get in touch.


I have created this blog to bring together and document some projects I have been working on with the hope it will provide some assistance, code samples and inspiration to people trying to build some cool projects.  I’m a software guy by trade and have an interest in low-power, low-cost platforms in what would now be called the Internet of Things (IoT) as well as machine learning and of course reptiles (hence the blog name).

There are plenty of API references and tutorials available which I am not going to try replicate, instead I’ll cover the various parts to the projects I have built with references to working code and a few tips and gotchas and hopefully a few references to good work I came across along the way.

Enjoy and get in touch if you want more information.

Gratuitous lizard shot:



Note: All images used are by me and are available under Creative Commons BY license.