Slides from our talk at Cascadia Ruby 2014
Hsing-Hui Hsu and I gave a talk at Cascadia Ruby two days ago about our experiences developing our capstone project from Ada Developers Academy and the journey of junior developers coming out of an alternative learning program. Here are our slides and a rough script of our talk. Enjoy!
L: Hi, Welcome to our talk, “Good luck with that: Tag Teaming Civic Data”, I’m Liz Rush!
H: And I’m Hsing-Hui Hsu, and we are members of the first cohort of Ada Developers Academy... H: ...which is a non-profit code school in Seattle focusing on training women making career shifts into tech. It focuses on web development using Ruby on Rails, as well as general software engineering principles such as software architecture, test-driven development and agile. It is a year long program split into two 6 month phases. The first 6 months consists of in-class instruction, and the second is an internship at a Seattle tech company--which ends October 27th at which point we will be available for hire. * cough cough *
L: Our 6 month classroom experience was based around Rails projects in short one or two week sprints, culminating in a final 4 week long self-directed capstone project. Hsing-Hui and I had previously worked together on a two week rails app, in which we started to discover where our individual interests and strengths lay.
L: So, I know this isn’t as much of a problem here in Portland, but if y’all have ever tried to park in Seattle you’ll know this game!
H: Our street parking signs are nearly impossible to decipher. In this particularly egregious example, it appears as if every sign contradicts the one before it- and as a final, cruel joke- they tacked on “west of here”- when from the direction this photo was taken, we are East of all of these signs. So where does that leave us?
L: After hitting this issue ourselves, we thought this might be an interesting problem to try to solve for our capstone. We had been granted a week of “free play” in Ada before the Capstone started, which I spent doing a research spike. Simply enough, I googled to find out if someone else had already solved this problem for me, and while there were no mobile apps, it turns out the city of Seattle in fact has a parking restrictions map on their website! I investigated the different data sets available through the City’s Open Data Initiative and I also managed to get a contact at the City of seattle- not in their parking or transportation department, but a data engineer on their web team.
H: So this is the way the Seattle city government implemented a map showing parking information. They use png overlays on map tiles to show the parking categories. As you can see, it’s a bit hard to read, the categories aren’t necessarily evident, and the data is static. We thought we could do better. Our idea was to use the open data to display the parking information in a simpler, more human-friendly way -- for example, rendering just green lines for available street parking and red ones for no parking.
H: So we were pretty excited. There was open data available, we had a contact in the city, and we had a starting point to improve upon. Morale was high. We had discovered our manifest destiny, and we sallied forth. But, knowing that this would require some help from the City of Seattle government, we called our contact.
L: We explained our idea and what we thought we could do, hoping to get some feedback or tips as we embarked on our journey. He said, “well, if you think you can build it you’ll be the first. Good luck with that. Ha ha ha ha. Ha ha. Ha.” and then he hung up on us.
H: In spite of this ringing endorsement, the guy did provide a link to their ArcGIS server, which was what their web application was using to generate the parking maps. Though we had never used an ArcGIS server, the Esri website told us about all the tools ArcGIS provided, including a way to dynamically generate the KML files we needed for our project. So, perfect! Except...
L: Turns out that our perfect resource wasn’t so perfect after all. As you can see, there were quite a few implementation problems. almost every informational link was a 404 and many of the links produced empty files that we couldn’t use to extract data from. At this point, not knowing what we were doing, we reached out to other developers on Twitter and got back a link to a fully implemented demo arcGIS server from a developer at Esri. We found out there were actually a lot of other parts to an ArcGIS server, including projection tools and coordinate conversions. Since we didn’t appear to have this on the Seattle MapServer, we called up our city contact again who told us that the person who originally implemented the product was no longer working there, they had no idea what was going on with the mapserver, and that they’d be unable to give us any guidance or help.
H: As we mentioned, the city used PNG overlays. But in order to implement the features we wanted, we decided to use KML files, or keyhole markup language files, which are an XML-type formatting for geodata used by Google Earth. When we did try to get a KML file from mapserver... H: ...this is how it actually rendered.
H: At this point, with no luck getting the KML files from the mapserver and two weeks into the four weeks we had for the project, we were starting to get worried that we wouldn’t finish in time. So we compromised on our original idea and decided to use PNG overlay instead to get a minimum viable product. We thought that this would be much easier to implement, since that was the strategy of the Seattle government’s site. L: Here is the png overlay we managed to generate from the map server! We were working on this project from downtown Seattle, so after a little bit of a struggle to figure out the query parameters, we were excited to see our successful generation of the png. We forged ahead for the next week, building out our backend and figuring out what to do with all these overlays when eventually we discovered...
L: ...that the graceful fail of a query looked exactly like the success!
H: Faced with this problem, we decided to snoop around the city gov’t’s code to see how they handled this issue. And the answer was...
H: ... they didn’t. As you can see, they’ve simply put in a comment in their else block that says “We didn’t get any parking results so… don’t do anything at all”
L: And here is my favorite comment: “What do here? TODO”. Which we realize is silly and we’re not trying to tear down the city’s code, but at this stage of our project after so much frustration, it was actually sort of comforting and validating in a way to realize that even Real Developers don’t know what they’re doing.
L: And one last gem: “This caused great grief”. We feel you, guys, we feel you.
H: So we eventually got to the point where we were getting the correct map that corresponded to our bounding box queries, and we decided to use the googlemaps API as the base layer.
H: However, we ran into the problem of the overlay not lining up with the map.
H: This was pretty perplexing--we were working under the assumption that whatever Google was using to display their maps, everyone else on the internet should be using it too, right? Well, it turns out, there are many different kinds of map projections, and different regions often use different systems. We ended up on a choose-your-own adventure sort of trail researching different kinds of projections and spheroids, and learned that many civic mapservers actually use flat map projections, which become increasingly inaccurate the further away you get from some central reference point. The Seattle government, for example, stores their geoinformation using a flat map projection specific to the Washington North State plane, whereas the googlemap API displays their maps using a spherical Mercator projection more suited for global geodata.
H: So basically we were trying to lay a flat map on top of a round map, and lost a lot of time trying to figure all of this out. Even when we got the correct spatial reference ID, we couldn’t figure out how to solve the skewing. After beating our heads against a wall, we figured out the solution...
L: ...by accident. I just happened to be messing around with the spatial reference codes and one afternoon typed the wrong number in the right box and ended up getting exactly what we wanted- despite it being opposite of what all the other examples of similar queries we saw on the internet were.
H: Now that we have PNGs, we had to figure out how to store them. Since we were anticipating building out both a web frontend client and an iOS app, we didn’t want to overload the city’s mapserver with requests. So we decided to separate our project into two separate Rails apps--a backend API that would actually make the request and cache the data from the city, which would serve a front-end web client. We used the Carrierwave gem to cache the png files in an Amazon S3 bucket, but since that was taking around 15 seconds, we set that as a background job and sent the client a tempfile. Then, any other request made with the same bounding box parameters would receive the cached file from the cloud.
L: At this point, it’s the end of April, we’ve just wrapped up the classroom portion of Ada, 6 months down and 6 months to go. We’ve reached the Chimney rock of our Ada journey to become developers!
L: We presented our app in front of Ada investors, our mentors & future bosses at our internships as well as friends and family. Additionally, we we also invited to present along with two other students who did civic data related projects at a local meet up for civic technology enthusiasts. Unfortunately we didn’t have a great experience. While most responded positively, we did get a bit of discouragement from spectators who wanted to tell us all about everything we were doing wrong with our entire lives.
H: We also presented at another local meetup, where a guy who tried to break the app during presentation. Which, on the one hand, rude, but on the other hand, it was like we were being viewed as an equal to this professional developer. It felt like we were in an awkward transition phase between beginner and junior developer, and it was hard to figure out the balance.
L: One thing we found extremely beneficial, not just for this particular project, but for making our transition from students to real world development in our internships was mentorship. We were lucky enough to have a great contract iOS developer, who was super enthusiastic about our project.
H: Her brother.
L: Yeah, my brother, John Rush, who we paid solely in top pot donuts and coffee in exchange for his computer science knowledge and experience developing and planning iOS apps. Because we knew that this product is really a mobile product, we came into it knowing we had a large blind spot when it came to iOS development. John was there to guide us while still letting us decide both the direction of the project and the design & user experience of the mobile app.
L: Additionally, I had two mentors from Substantial, Mark Kornblum and Robin Clowers, who were mentoring me during the month of the capstone and who have remained my mentors for the last 3 months during my internship there. Having the continuity of mentorship over the course of a larger project turned out to be incredibly valuable. We had experienced developers to ask questions, like “how do we use MongoDB”, but also they were there to put our setbacks into perspective and give us real constructive criticism. Instead of just a one-off question we might ask say, on twitter, we had these valuable resources that not only knew what our current skill level was, but knew the whole history of the project and how to help push us in the right direction while maintaining distance from the actual code. We believe the support from mentors at Substantial as well as having my brother helped keep us motivated to pursue the project after Ada ended and it also made my internship onboarding go more smoothly. However, Hsing-Hui had a different experience.
H: Yeah, while Liz had a lot of support at her internship, at my company, lack of structure, management, and support seems to be the norm. My company, EMC/Isilon, which does essentially no web development, is used to employees who come from traditional college CS backgrounds. So it was hard to navigate the transition to a different area in tech, and be a representative of an alternate learning paradigm. The learning curve was much steeper (felt like starting from square 1), and I had less support. For example, one of the members on my team came to the Code for Seattle presentation, his first reaction was “that was better than I expected.”
L: So how does this all relate back to this project? Well we discovered what we believe is a hard lesson for developers: code is actually a very small part of what makes development difficult. We had to start to figure out what we wanted our product to look like, what we thought success would look like, and how to work together without a hierarchy. Part of this involved scheduling actual workdays to meet and work face to face, using pivotal tracker to ensure we were tracking our progress, and the most important lesson: don’t code when you’re hangry.
L: We learned this the hard way as you can see in this action shot of Hsing-Hui and me!
H: So now with Liz firmly established as, you know, my boss, and also equipped with the mentality that we might actually be becoming real developers, we wanted to continue phase 2 of our app following our original idea of manipulating the lines on the map overlay itself.
H: We finally found a KML file of the entire city on the open data website and managed to parse it, only to discover that it did not contain all the data we needed.
H: We then discovered that there was a different file format that did contain all the information we needed. These were shapefiles, which are another Esri tool that stores geodata in a digital vector storage format, but it isn’t human-readable and requires some third party library to decipher. We tried asking around for which gems would be best suited for the job, but the open source gems (rgeo and georuby) weren’t sufficient, and while there was a gem being developed by Esri, it is apparently not yet ready for release.
H: Then, we finally found a solution. It was...
Audience Plant: Boo!
H: Turns out Python libraries are really good at math, and handled both reading the shapefile and converting the data to the correct projection. Though we wanted to keep our API a Rails app, we thought we could make this work and integerate a python script as a separate cron job on the server and feed the shapefile data to Rails...
L: But it turns out the data was bad anyway. We were told the data from the city was being maintained and was updated every day, but on the open data website it turned out the last time it was updated was in 2011.
H: So much for that idea. So we decided to go back to the ArcGIS mapserver. That’s right....
L: ...This guy.
H: So in order to connect the public data to the city’s mapserver, we used some of the information about the geometry objects provided in a CSV on the open data website, such as the Unit ID number, to query the ArcGIS server directly.
L: As Hsing-Hui mentioned, we are now implementing a solution which uses both the Open Data initiative and the MapServer on the government’s site. This means that even though we started our project architecting a way to play nice with the City’s server and avoiding hitting that all the time, now, every time we wish to get the more up to date data, we will have to scrape for our data by querying until we hit the result limit- this means that we’re potentially going to have to query 2,300 times to ensure we have the newest data for all parts of the city.
H: But, you know, it should work. In theory.
L: (unscripted demo of iOS app)
H: Even with only a few months of training and having never programmed before starting ADA, writing code was never the hard part - it was working with data we couldn’t control and which had little/no documentation that were the biggest challenges. Not only that, but it became increasingly clear that there is never one straight path to a solution, in spite of how simple an idea might seem at the get go, which is how she fooled me into working on this. Basically, we didn’t know what we were getting into, and no one else did, either.
L: So because of all this ambiguity, we had to redefine what success meant. Part of this learning process involved the discovery that if you want to build something new, you aren’t necessarily going to be able to look up the solution as simply as clicking on the first stackoverflow result. While frustrating, it was also an empowering experience to see that other so-called “real developers” with real jobs don’t always know what they are doing and that someday we’re going to be those real developers who don’t always know what they are doing!
L: Originally, we wanted to make an app that transformed the way the Open Data was currently being used. This meant something interactive, human-friendly, and mobile. However, we ended up having to ‘settle’ for an MVP that wasn’t exactly what we envisioned due to the restrictions of the 4-week deadline. Which isn’t to say that our 4 week MVP wasn’t something to be proud of- on the contrary, it was a little bit terrifying that after just 5 months of learning to code could not only build something like this but also could see a vision and path to an even better product.
H: And even when we spend hours and hours on code we don’t end up using, we also realized that investigating potential solutions and figuring out why they may not work could also be counted as a win, and in some ways is just as productive as checking in production. This was actually a really hard lesson to learn-- the fact that all of us in ADA were making a sharp turn in our careers to go into tech made us feel like we had to “catch up” to where other junior developers with computer science backgrounds were. So anytime we ended up writing code that wasn’t used or ended up being the wrong approach seemed like we were wasting time or taking steps backwards somehow. But persisting through even when all roads seemed to lead to a dead end really deepened our mental and emotional stamina and gave us a better idea of how real-life code evolves.
L: So seriously, if anyone can help us wrangle this data more efficiently, that’d be great.
L: So, once again, I’m liz rush
H: and I’m Hsing-Hui Hsu
L: And we hope you enjoyed our talk! Thanks for coming!