I'm working on a few mini-projects that I'll post about over the next few weeks:
👟 Product Type Classification
As part of SYDE522 Machine Intelligence, I trained a ResNet-18 convolutional neural network (CNN) on the UT Zappos50K dataset to classify footwear as sandals, boots, slippers, or shoes. The model achieves 96% accuracy.
📷 3D Models from 2D Photos
Last week Apple released the Object Capture API, which makes it pretty easy to turn 2D photos into 3D models via photogrammetry. I'm syncing the sample capture app with the RendAR turntable over BLE and testing it out.
✨ 3D Models from LiDAR
I'm super grateful to have received the Apple Award for Overall Presentation Quality at my capstone symposium. The prize was an iPad Pro, which is equipped with a LiDAR sensor. This gives me another way to capture 3D models.
🪛 3D Printer
I was also honoured to receive the Quansar People's Choice Award as voted on by my classmates 🙏 and put the cash prize toward a Prusa i3 MK3S+ 3D printer. First step is assembling it.
Post 1: Introducing RendAR
Post 2: The Turntable
Post 3: Capture Rig Lighting
Post 4: Motor Control
Post 5: System Integration & Bluetooth
Post 6: Scale & iPhone App
In addition to photos and mass, the RendAR system measures product dimensions (ostensibly) for shipping. I'll come clean though -- dimensions are not a Shopify field 😱. I just wanted to play around with ARKit. Admittedly, this is also the first explicit nod to the "AR" in RendAR. More to come on that.
ARKit is an iOS framework that lets developers build augmented reality (AR) experiences. It uses camera and motion data on your iPhone or iPad to deliver capabilities like depth, scene geometry, and motion tracking. Why does that matter? AR overlays digital information on your view of the real world. To do that, your device needs to understand your physical environment. ARKit provides that understanding.
Measuring dimensions with your camera is a simple, demonstrative use case for ARKit. If you have a device with iOS 12.0 or later, you already have Apple's Measure app. There are plenty of others available in the App Store too. So how do they work?
From a user perspective, most AR measuring apps work like this (example image below):
Under the hood, there are two main processes enabling the measurement: visual inertial odometry (VIO) and ray casting.
Visual Inertial Odometry (VIO)
(Coming soon...distracted by another side-project 🤖)
Ray Casting
(Coming soon...distracted by another side-project 🤖)
Here's a capture demo that includes dimension measurement (starting at 0:20). The demo integrates an open-source AR Ruler app into the RendAR app. As you can see, measuring dimensions requires active participation from the user, so it takes away from the automatic-ness of the capture. It also makes the measurement prone to user error. Not ideal, but I had fun experimenting with ARKit.
In a future iteration, I could potentially use computer vision to add bounding boxes to the product images and then perform ray casting on the bounding boxes to determine dimensions. This would integrate dimension measurement into the automated capture without needing the user to do anything extra.
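For the curious, here's roughly what the ray-casting half looks like in Swift. This is a generic sketch of the ARKit pattern (the class and variable names are made up for this post, not pulled from the AR Ruler app):

```swift
import UIKit
import ARKit
import simd

// Sketch: tap two points on screen, ray cast into the scene, and
// measure the distance between the hits. Names are illustrative.
class MeasureViewController: UIViewController {
    @IBOutlet var sceneView: ARSCNView!
    private var points: [simd_float3] = []

    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        guard let tap = touches.first?.location(in: sceneView),
              // Cast a ray from the tapped pixel into the 3D scene
              let query = sceneView.raycastQuery(from: tap,
                                                 allowing: .estimatedPlane,
                                                 alignment: .any),
              let hit = sceneView.session.raycast(query).first else { return }

        // The hit's transform holds the 3D world position in metres
        let t = hit.worldTransform.columns.3
        points.append(simd_float3(t.x, t.y, t.z))

        if points.count == 2 {
            let metres = simd_distance(points[0], points[1])
            print(String(format: "Distance: %.1f cm", metres * 100))
            points.removeAll()
        }
    }
}
```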
This week Apple released an Object Capture API that lets you easily construct 3D models from 2D images through a process called photogrammetry. If the photos come from an iPhone or iPad, they also contain depth data, which can be used to calculate the object's real-world size. I'm experimenting with Object Capture and will report back.
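For a taste of what that looks like, here's a minimal sketch of RealityKit's PhotogrammetrySession (it runs on macOS 12+; the file paths here are made up):

```swift
import Foundation
import RealityKit

// Sketch: turn a folder of photos into a USDZ model. Paths are placeholders.
func makeModel() async throws {
    let photos = URL(fileURLWithPath: "Captures/shoe", isDirectory: true)
    let output = URL(fileURLWithPath: "Models/shoe.usdz")

    var config = PhotogrammetrySession.Configuration()
    config.sampleOrdering = .sequential  // turntable shots arrive in order

    let session = try PhotogrammetrySession(input: photos, configuration: config)
    try session.process(requests: [.modelFile(url: output, detail: .medium)])

    // Progress and results stream back asynchronously
    for try await message in session.outputs {
        switch message {
        case .requestProgress(_, let fraction):
            print("Progress: \(Int(fraction * 100))%")
        case .requestComplete(_, .modelFile(let url)):
            print("Model saved to \(url.path)")
        case .requestError(_, let error):
            print("Failed: \(error)")
        default:
            break
        }
    }
}
```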
I won't have a post up this week, but here are a couple of sneak peeks at my RendAR promo video 🎬 and poster for the virtual Capstone Design Symposium next week at the University of Waterloo.
Post 1: Introducing RendAR
Post 2: The Turntable
Post 3: Capture Rig Lighting
Post 4: Motor Control
Post 5: System Integration & Bluetooth
In my last post I said I'd either talk about camera integration or load cell calibration next. Well hold onto your hat because they're both working now. Jumping right in, here's a demo capturing my running shoe. The app UI is shown on the left. (Note: this demo doesn't include image enhancement / background removal.)
I designed the iPhone app user interface (UI) in Figma and then built it in Xcode. The UI guides the user through a sequence of steps:
For those who care about software implementation, the app is built using the standard model-view-controller (MVC) design pattern. In step 1, a BLEManager class is initialized. The BLEManager is passed from view to view to maintain the Bluetooth connection. In step 2, a Product class is initialized. The Product object is passed from one view to the next to populate the data fields, including name, mass, and images. Eventually the product object will be sent to the cloud for image processing (enhancement + background removal) and then to the retailer's Shopify store.
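Here's a stripped-down Swift sketch of that pattern. The BLEManager and Product names come straight from the app, but the field types and view controller names are simplified placeholders:

```swift
import UIKit
import CoreBluetooth

// Simplified stand-in: the real class wraps the Core Bluetooth connection
class BLEManager: NSObject {
    let central = CBCentralManager()
}

class Product {
    var name: String?
    var mass: Double?            // grams, read from the load cells
    var images: [UIImage] = []   // one photo per turntable angle
}

// Placeholder view controllers showing how the objects travel between steps
class CaptureViewController: UIViewController {
    var bleManager: BLEManager!  // created in step 1, handed view to view
    var product: Product!        // created in step 2, filled in along the way

    override func prepare(for segue: UIStoryboardSegue, sender: Any?) {
        // Pass the shared objects forward so the next view keeps the
        // Bluetooth connection and keeps populating the product
        if let next = segue.destination as? SummaryViewController {
            next.bleManager = bleManager
            next.product = product
        }
    }
}

class SummaryViewController: UIViewController {
    var bleManager: BLEManager!
    var product: Product!
}
```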
Side note: Designing the UI was fun! Back in the day, I was a bit of an art kid so it's nice to remind myself of that. In the spirit of shameless self-promotion, here's some evidence of my old art skills. I won my high school art contest for this painting without even entering, and you better believe I'm still smug about it. Thanks to my grade 12 art teacher, Mrs. Skol, wherever you are.
The mass measurement and image capture happen in step 5. The iPhone and capture rig talk to each other over Bluetooth to synchronize their actions. This state diagram shows the states (blue), events (green), and conditions (black) that dictate when things happen:
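In code, the iPhone side of that diagram boils down to a little state machine. Here's a simplified Swift sketch (the state and event names are my shorthand for the diagram, not the app's actual identifiers):

```swift
// States (blue), events (green), and a condition (photos remaining)
enum RigState { case idle, rotating, readyForPhoto, done }
enum CaptureEvent { case startPressed, rotationFinished, photoTaken }

func nextState(_ state: RigState, _ event: CaptureEvent,
               photosRemaining: Int) -> RigState {
    switch (state, event) {
    case (.idle, .startPressed):          return .rotating
    case (.rotating, .rotationFinished):  return .readyForPhoto
    case (.readyForPhoto, .photoTaken):
        // Condition: keep rotating until every angle is captured
        return photosRemaining > 0 ? .rotating : .done
    default:                              return state
    }
}
```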
Load Cell Calibration
The load cells were a pain in the butt to get working but I only have myself to blame. The capture rig has four load cells and I tried to calibrate them together. That didn't work very well — mass readings were inaccurate, noisy, and unreliable. I did the logical thing and avoided calibrating the load cells again until right before a demo with my professors. This timing wasn't totally by choice. I physically broke my microcontroller around this time and had to wait a few days for a new one to arrive 😬. Fortunately, once I sucked it up, disassembled the turntable, and calibrated the load cells individually, they worked great. Lesson learned -- procrastination works, kids 😉. Just kidding, don't recommend.
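For reference, the standard per-cell calibration math is simple: tare each cell to find its zero offset, then place a known mass on it to find a scale factor. The rig firmware is Arduino, but here's the idea sketched in Swift with illustrative names:

```swift
// Each load cell amplifier reports a raw integer reading
struct LoadCell {
    var rawOffset: Int32   // reading with nothing on the platform (tare)
    var scale: Double      // grams per raw count, from the known mass

    func grams(from raw: Int32) -> Double {
        Double(raw - rawOffset) * scale
    }
}

// Total mass is the sum of the individually calibrated cells
func totalMass(cells: [LoadCell], readings: [Int32]) -> Double {
    zip(cells, readings).reduce(0) { $0 + $1.0.grams(from: $1.1) }
}
```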
I also discovered that one of the load cell amplifiers is a dud. When a load cell supports a mass, it bends ever so slightly. This deflection changes the length of strain gauges within the load cell, which changes the resistance in the circuit, which changes the voltage signal read by the microcontroller. The voltage changes are so small that they must be amplified to be meaningful.
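How small? Here's a back-of-envelope calculation with typical load cell numbers (illustrative values, not my exact parts):

```swift
// A common load cell sensitivity is ~2 mV per volt of excitation at full scale
let excitation = 5.0       // volts across the strain gauge bridge
let sensitivity = 0.002    // 2 mV/V, a typical full-scale rating
let fullScale = excitation * sensitivity  // = 0.01 V (just 10 mV!)
// A 10 mV full-scale swing is far too small for a microcontroller's ADC
// to resolve finely, hence the amplifier between each cell and the board.
print("Full-scale output: \(fullScale * 1000) mV")
```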
I removed the faulty amplifier and load cell, leaving the capture rig with three functioning load cells. Not ideal in terms of balance, but it works. Here's a demo showing the accuracy of the load cells.
You may have noticed a couple of "Coming soon!" fields in the Capture Summary screen. Those fields are Type and Dimensions. I'm working on measuring product dimensions in the app using ARKit. I'm also taking a machine intelligence course and piggybacking my course project on my FYDP. I'll be training a model to populate the product type field.
Up next: Probably measuring dimensions with ARKit 📏
It's time to put the pieces together! So far I've built separate parts of the RendAR system like the turntable, softbox, and motor control loop. Now I'm putting them together physically and with code.
First, I attached the lid to the turntable housing using a piano hinge and lid stays (things that hold the lid up). I wasn't sure how well the lid stays would work, but they turned out perfectly. They have enough friction to hold the lid vertical while being easy to open and close.
Next, I wired everything together and ran a few tests to check the connections. Everything worked as expected...except for the button. Yes, the button — the simplest part of the whole thing. Eventually I realized I needed to configure the button pin on the Arduino with a pull-up resistor (without one, the pin floats and the reading is unreliable) and then it was peachy.
Why is there a button? I asked myself that question when it wasn't working, but there's a good reason for it. When I researched competitor products, I read reviews for a bunch of photography and 3D scanning apps. One complaint users had was that they'd line up their shot, press the shutter button in the app, accidentally move their phone, and have to start over. A physical button (with a pull-up resistor...important detail) prevents this problem. Once your camera is lined up, you don't have to touch it again.
Here's how the rig looks! I'm debating whether to paint the turntable disc and lid to match my initial design but that's a future-Laura problem. For now I'm enjoying the Scandinavian aesthetic. The phone mount isn't shown here, but it attaches through the hole in the front.
Bluetooth is a type of wireless communication that lets devices talk to each other across short distances. Data is sent between devices via radio waves oscillating at 2.4 GHz. That's 2.4 billion waves per second! In RendAR, I'm using Bluetooth Low Energy (BLE), which uses less power than classic Bluetooth, to connect the capture rig and iPhone app.
You can think of BLE like the community bulletin board at your local coffee shop. In BLE jargon, the bulletin board is a peripheral device that shares information. Each flyer on the bulletin board is a service and each service has characteristics. For example, the bulletin board (peripheral) might have a concert flyer (service) listing date and ticket price (characteristics). As an observer of the bulletin board, you are a central device. You look at the bulletin board and read the information you care about.
For RendAR, the capture rig is the peripheral device. It has a capture service with state and mass characteristics. The iPhone is the central device and reads data from the capture rig peripheral. For example, the capture rig will update the state characteristic to say the turntable finished rotating. The iPhone reads the state and knows it's safe to take a picture. The iPhone can also update the state characteristic after taking the photo, letting the turntable know it can rotate again. This is how the turntable and camera synchronize.
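Here's a stripped-down Swift sketch of the iPhone (central) side using Core Bluetooth. The UUID and the meaning of the state bytes are placeholders, not RendAR's real values:

```swift
import CoreBluetooth

class RigClient: NSObject, CBPeripheralDelegate {
    static let stateUUID = CBUUID(string: "2A3D")  // placeholder UUID
    var stateCharacteristic: CBCharacteristic?

    func peripheral(_ peripheral: CBPeripheral,
                    didDiscoverCharacteristicsFor service: CBService,
                    error: Error?) {
        for c in service.characteristics ?? [] where c.uuid == Self.stateUUID {
            stateCharacteristic = c
            peripheral.setNotifyValue(true, for: c)  // subscribe to state updates
        }
    }

    func peripheral(_ peripheral: CBPeripheral,
                    didUpdateValueFor characteristic: CBCharacteristic,
                    error: Error?) {
        guard characteristic.uuid == Self.stateUUID,
              let byte = characteristic.value?.first else { return }

        if byte == 1 {  // placeholder for "rotation finished, safe to shoot"
            takePhoto()
            // Tell the rig the photo is done so it can rotate again
            peripheral.writeValue(Data([2]), for: characteristic,
                                  type: .withResponse)
        }
    }

    func takePhoto() { /* trigger the camera capture */ }
}
```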
BLE Communication Test
I made a bare-bones iOS app to test Bluetooth communication between the capture rig and iPhone. I'm using the ArduinoBLE library on the capture rig side and the Core Bluetooth framework on the iPhone side. Here's a split-screen demo with the app shown on the left. In the demo, I'm pressing the Photo Trigger button, which is a placeholder for the actual camera shutter.
Disclaimer: I don't actually have incredibly light running shoes, I just haven't (successfully) calibrated the load cells yet.
Bluetooth is named after King Harald "Bluetooth" Gormsson, who united Denmark and Norway in 958. He was known for having a discoloured tooth, and now he will be for all eternity. In 1997, Jim Kardach, an engineer at Intel, was reading a book about King Gormsson and liked the name, since Bluetooth unites devices. The Bluetooth icon is composed of the Nordic runes for H and B in honour of King Harald.
Up next: Either load cell calibration ⚖️ or camera integration 📸...it'll be a surprise for both of us which one works first.
Back in the day, I did a degree in kinesiology and learned about something called proprioception. Proprioception is how you know where your body is in space. Try this — close your eyes and touch your nose with your right pinky finger. Easy peasy. Why is that? How does your finger know where your nose is? The answer is proprioception.
Long story short, we have little stretch sensors in our muscles called muscle spindles and Golgi tendon organs. They're embedded in muscle fibres and tendons and send signals to the brain as they're stretched and compressed. Our brain interprets these signals to understand where the muscles are located compared to the rest of the body.
So when you go to touch your nose, your brain a) outputs signals telling certain muscles to contract, and b) receives feedback from stretch sensors about your body location. This tells your brain where your pinky is versus where you want it to go. Controlling a motor works basically the same way.
A little to the left.
In the RendAR system, a motor rotates the turntable using closed-loop feedback to capture product photos at specific angles. This is important because it ensures consistency across a retailer's online catalogue. It doesn't look good to a customer if you have a page of products facing different directions.
For RendAR, a microcontroller is the brain, a motor is the muscle, and a motor encoder is the muscle spindle. Closed-loop feedback is to machines what proprioception is to humans.
To start, let me tell you what we're working with. The motor I'm using to rotate the turntable is a 12V, 16.7 RPM DC motor with a rotary encoder. The main factors that went into choosing the motor were RPM and power.
The turntable needs to rotate pretty slowly, in the range of 5-10 RPM. For reference, a microwave rotates at about 6 RPM. I control the speed of the motor using pulse-width modulation (PWM), but it's better if the motor is low-RPM to begin with.
I also need a 12V motor that draws less than 2.25A to fit within my power budget (power = voltage × current, so 12V × 2.25A = 27W). Finally, I wanted the motor to have a built-in encoder, which is a sensor that measures how far the shaft rotates.
Controls Appreciation Paragraph
Automatic control has to be one of the coolest things I've learned in engineering school. It's also a subject I knew nothing about beforehand. Now I think controls is kind of like peak engineering. It combines physics, mathematical modelling, mechanical engineering, electrical engineering, simulation, and software.
In a nutshell, controls is how you make a dynamic (moving) system do what you want it to. Control systems engineering is how NASA can land Perseverance on Mars, SpaceX can launch and land rockets, and Boston Dynamics can make robots dance.
And now, controls is how I accomplish the infinitely less complex but deeply satisfying feat of making a motor rotate exactly 90 degrees.
There are two types of control systems — open-loop and closed-loop. In open-loop, you send a signal to an actuator, cross your fingers, and hope for the best. In closed-loop, you measure the output of the system and compare it to what you wanted to happen. If there's a difference between your input and your output, you try to correct it. This difference is called error.
RendAR uses a closed-loop negative feedback system with a PID controller. Negative feedback means the controller aims to reduce error, as opposed to positive feedback, which increases it. The PID controller is a mechanism that calculates how to reduce the error. It says, "Given this error...*does math*...send this voltage to the motor."
Here's a block diagram of my closed-loop negative feedback system:
Now for a side-by-side demo! Here's a comparison of the motor rotating under open-loop and closed-loop control. In the open-loop system, you can see that error accumulates with every rotation. In the closed-loop system, the motor rotates 90 degrees each time.
This is what happens for each rotation of the closed-loop system:
The PID controller determines how steps 4-6 pan out. The parameters Kp, Ki, and Kd shown in the diagram above can be varied to adjust how quickly the motor reaches its reference angle, how much the motor oscillates before settling, and how much error remains when the motor reaches steady state and stops moving.
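If you like code more than block diagrams, here's the PID update law sketched in Swift. The real controller runs on the Arduino in C++, and these gains are purely illustrative:

```swift
struct PIDController {
    var kp: Double, ki: Double, kd: Double
    var integral = 0.0
    var lastError = 0.0

    // error = reference angle - measured angle (from the encoder)
    mutating func update(error: Double, dt: Double) -> Double {
        integral += error * dt
        let derivative = (error - lastError) / dt
        lastError = error
        // The output maps to a PWM duty cycle sent to the motor driver
        return kp * error + ki * integral + kd * derivative
    }
}

// One control-loop tick: aim for 90° given a pretend encoder reading
var pid = PIDController(kp: 2.0, ki: 0.5, kd: 0.1)
let encoderAngle = 87.5  // degrees, pretend reading
let command = pid.update(error: 90.0 - encoderAngle, dt: 0.01)
```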
I tuned the values of Kp, Ki, and Kd through trial and error to find ones that produced smooth rotation without overshoot or oscillation. Simple but satisfying!
Up next: Connecting the capture rig and iPhone via Bluetooth
Another find in my Dad's workshop — a day-of-the-week clock. What a concept. In fact, a very useful concept when you're working in a basement during a pandemic and the days start to blend together.
This clock used to live in my Grandpa's workshop before he passed away in 2014. He knew how to build everything. Here's a photo from when he helped me tear my room apart and put it back together when I was about 14.
What's a Softbox?
Lighting is one of the hard parts of DIY product photography. One retailer said they photograph products in the same spot at the same time of day using only natural light. Rainy day? No bueno.
To solve this problem, the capture rig includes a built-in lighting solution. The lid of the capture rig functions as a softbox, which is a go-to light source in photo studios. A softbox provides soft, even illumination and usually looks something like this:
I started off building an open plywood box and lining it with tinfoil to make the inside reflective. Next, I cut and placed strips of LEDs with a colour temperature of 6000K (daylight white). There are close to 600 LEDs in this bad boy.
I wired the LED strips in parallel to reduce the amount of current going through any single strip. Soo much soldering. Electrical tape prevents the wiring connections from shorting. One side of the circuit connects to power and the other to ground, and then out to the wall adapter through a barrel connector.
Moment of truth...it turned on! And it's BRIGHT. It was satisfying to see it work on the first try after the lengthy assembly process. Electrical debugging avoided, phew.
The LEDs are a little too bright and harsh on their own. To get soft, even illumination, I needed to diffuse the LEDs. I ordered white "diffusion fabric" off of Amazon. This is the kind of material used in actual softboxes to scatter light.
I cut the fabric to size and stapled it to the softbox frame. It took three layers to achieve the kind of glow I wanted. Finally, I added plywood edge banding to tidy things up.
In hindsight, it would have been cool to make the brightness of the softbox "smart." I could have included an ambient light sensor and varied the LED voltage automatically to suit lighting conditions. For now, there's one level...💡BRIGHT💡.
Up next: Controlling the motor 🤖
Check out my first post for context: Introducing RendAR
I'm writing this from the workshop at my parents' house. Campus labs and workshops closed for the lockdown in Ontario so this is where I’m posted up. Lucky for me, I have very handy parents who don’t mind having their adult kid around.
It’s a pretty sweet workshop. Exhibit A — this is a jigsaw from the 1930s my dad bought at a garage sale when he was 25.
And just because, here’s me in my dad’s workshop growing up. Same vibe and same lamp and jigsaw in the background.
So back to building…
When I think of a turntable, I think of a record player. That analogy stuck and became the inspiration for the capture rig form factor. Functionally, there are a few things the capture rig needs to do:
I ordered parts before Christmas and started assembling the base in early January. The turntable/scale subassembly includes four load cells, a DC motor, a Lazy Susan bearing, and a platform for the product to sit on. The subassembly is isolated from the rest of the capture rig to ensure the full product mass is measured by the load cells. The housing goes on last and is easily removed so I can access the mechanical/electrical parts as needed.
Originally I planned to 3D print and laser cut some parts, but I pandapted (pandemic-adapted) the design to rely on woodworking given the tools I have. As a result, this thing is SOLID.
The capture rig is powered by a wall adapter and controlled by an Arduino Nano 33 BLE. Once I got everything wired up, I tested it out with the help of my 8-year-old niece, Evie. Just a couple of school kids learning remotely! We got the motor to rotate and the load cells to give uncalibrated readings. (The platform isn't screwed to the bearing here, so it looks a bit wobbly.)
Quote from Evie after filming this:
I can't believe you did so much work and that's all it does.
Just over here inspiring the next generation 🥲.
A few things I learned from assembling the turntable/scale:
Up next: how I built the softbox lid 💡
Hey, welcome to RendAR! I'm Laura, a fourth-year Mechatronics Engineering student at the University of Waterloo. I'm building RendAR (partly) for my fourth-year design project. I researched and designed RendAR during fall 2020 and am in the process of building a functional prototype this winter.
So why RendAR?
Last summer, a friend of mine helped a retailer set up an online store for the first time. Like many businesses, the retailer felt they had to be online to survive the pandemic. His experience taught me about some of the hard parts of managing an online store.
Think about the last time you bought something online. You probably don't have to think too hard. For me, it was two days ago. Unsurprisingly, the COVID-19 pandemic accelerated the already-growing e-commerce trend. In 2020, e-commerce retail sales in the US jumped a whopping 30%.
Brick-and-mortar retailers have been pushed to move their stores online or risk going out of business. In the words of one retailer I spoke to,
Stores without an online footprint are on palliative care.
One of the primary platforms enabling this transition is Shopify, which powers over 1,000,000 businesses worldwide. Shopify is a game changer for merchants, no doubt. But after talking to a few retailers, I learned they still have challenges. Two big ones are product photography and data entry.
It turns out taking great product photos isn't that easy and nobody likes tedious admin. This is especially true for non-technical retailers who are already stretched thin. They can either learn to do it themselves, which takes time, or hire someone, which takes money.
Retailers need an easy and inexpensive way to capture quality, consistent product photos and data and add them to their online store.
So that's what I'm working on. An interesting thing about these two problems (product images + data entry) is that they have something in common -- they're repetitive. This means they're candidates for automation, and that's what I aim to do with RendAR.
RendAR automates product photography, data entry, and Shopify integration.
RendAR lets retailers digitize their products and get them online with the press of a button, no technical skills required. The system includes an all-in-one capture rig with lighting and a motorized turntable, an iPhone app, cloud-based image processing, and Shopify integration.
It works like this:
In this proof of concept, 2D photos are a starting point, but they could definitely be expanded to 360° images and 3D captures. Similarly, I chose to measure mass and dimensions since that info is needed for shipping, but other data types could also be captured. In the future, more (all?) Shopify product data fields could be auto-captured so that RendAR completely digitizes the product. Fully digitizing products opens up ideas for other cool/useful applications...but that's for another day.
Building RendAR gives me a chance to design and build an end-to-end system and play around with things like mechanical design, motor control, sensor calibration, Bluetooth, iOS development, image processing, UX, and APIs. I'll share some of that here in future posts.