Gumbug: A better way to browse real estate

Last summer I really wanted to find a decent rental apartment around London. Every day I scoured Gumtree, Rightmove and the like in search of something affordable. In the end I decided to wait until I was able to buy an apartment instead, but I spent several weeks searching and getting annoyed at real estate sites nonetheless. I decided I could save myself a lot of time and effort by automating some of the steps of my search process. My search process went roughly like this:

  • Go to Gumtree, search by location and price
  • Mentally filter out all the ads that I'd already rejected, usually because they were old or just looked crappy
  • Check the new ads, decide which ones I might be interested in based on my more subjective criteria (not ground floor, not too far from public transport, not in a high-crime area etc.)
  • Repeat the above process for a different set of locations
  • Repeat the above process for all locations on a different website (Rightmove, Zoopla etc.)

Thus Gumbug was born. Initially it was meant to search both Gumtree and Rightmove for rental apartments, but I've adapted it to only do Rightmove's To Buy section, for now. I've found a lot of duplication between sites that list property for sale, whereas for rental apartments there was often a whole category of quirky private listings that would only appear on Gumtree. The need to scrape multiple sites seems a lot smaller when only considering things to buy.

You can find Gumbug on GitHub: https://github.com/rv/gumbug. I'm also running a semi-public version of it on Heroku, although it won't be very fast if a lot of people end up using it. You can have a play with it here: http://floating-forest-4090.herokuapp.com/, or to see some example search results, have a look at this link: http://floating-forest-4090.herokuapp.com/s/gzr1vwthsd. Since it might not handle the load, I'll describe how it works.

For each search you can add multiple sources, which are all consolidated into one page. I tried to avoid pagination of things as much as possible because I just want to see everything on one big page that I can scroll through at my leisure. If a listing appears on more than one source url it'll only appear once in the results. If the listing is already in the system its details won't be re-fetched every search, to save time. Adding urls as input might be a bit 'techy' but it saves a lot of coding time and allows me to specify a whole bunch of hard filters right at the source, since the url can already contain filters for price range, number of bedrooms etc.
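The consolidation described above can be sketched roughly like this. It is only an illustration, not Gumbug's actual code: `Listing`, `consolidate`, `list_ids` and `fetch_details` are hypothetical names, and it assumes every listing has a stable id that is the same no matter which source url it was found on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Listing:
    listing_id: str  # assumed to be identical across source urls
    details: str     # whatever was scraped from the listing page


def consolidate(
    source_urls: Iterable[str],
    list_ids: Callable[[str], List[str]],  # hypothetical: ids found on one source url
    fetch_details: Callable[[str], str],   # hypothetical: scrape a single listing
    cache: Dict[str, Listing],
) -> List[Listing]:
    """Merge all sources into one de-duplicated list, reusing stored listings."""
    seen = set()
    results = []
    for url in source_urls:
        for listing_id in list_ids(url):
            if listing_id in seen:
                continue  # already in the results via another source url
            seen.add(listing_id)
            if listing_id not in cache:
                # Only fetch details for listings that are not in the system yet.
                cache[listing_id] = Listing(listing_id, fetch_details(listing_id))
            results.append(cache[listing_id])
    return results
```

Passing the two fetch functions in as arguments keeps the sketch independent of any particular site or scraping library.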

Keywords

You can add a list of keywords to ignore and a list of keywords that are required. E.g. you can ignore 'ground floor, retirement' and you can require 'leasehold'. For the ignored keywords, if a listing contains at least one of the keywords, it'll be marked as ignored and moved to the bottom. For the required keywords, if an ad doesn't contain at least one of the required keywords, it will also be marked as ignored and moved to the bottom.
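In code, the two keyword rules boil down to something like the sketch below. The function name is made up for illustration, and the case-insensitive substring matching is an assumption on my part, not necessarily what Gumbug does.

```python
from typing import Iterable


def is_ignored_by_keywords(text: str, ignored: Iterable[str], required: Iterable[str]) -> bool:
    """Return True if the listing should be marked as ignored and moved to the bottom."""
    haystack = text.lower()
    # Normalise the keyword lists and drop empty entries.
    ignored = [k.strip().lower() for k in ignored if k.strip()]
    required = [k.strip().lower() for k in required if k.strip()]

    # A single ignored keyword is enough to reject the listing.
    if any(k in haystack for k in ignored):
        return True
    # If required keywords were given, at least one of them must be present.
    if required and not any(k in haystack for k in required):
        return True
    return False
```

With the example keywords, `is_ignored_by_keywords("Top floor flat, leasehold", ["ground floor", "retirement"], ["leasehold"])` returns `False`, while a freehold ground floor flat would be ignored twice over. Note that with plain substring matching an ad that says "not ground floor" still matches the ignored keyword 'ground floor'.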

Filter by distance to public transport

The public transport filter lets you select the stations you wish to be near to (or far away from). The list of stations is prepopulated from the zoned stations around London, but it'll automatically update after every search. If you add at least one station filter, all the listings will have to match at least one of your station filters, or else they will be ignored. E.g. if you add two filters: between 0.0-0.5 miles from Chesham station and between 0.2-1.0 miles from Amersham station, a listing must be within range of either Chesham or Amersham (but not necessarily both) to match.

The distance filter is pretty stupid because distances are simply scraped from Rightmove, which (as far as I can tell) only shows straight-line distance. You might have to make a massive detour to get to the station, but Rightmove will still happily report that the listing is right next to the station.
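The "match at least one station filter" rule can be sketched as follows. `StationFilter` and `matches_station_filters` are hypothetical names, the distances are assumed to be the straight-line miles scraped from the listing, and treating the range bounds as inclusive is my own assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class StationFilter:
    station: str
    min_miles: float
    max_miles: float


def matches_station_filters(distances: Dict[str, float], filters: Iterable[StationFilter]) -> bool:
    """distances maps station name -> straight-line miles for one listing."""
    filters = list(filters)
    if not filters:
        return True  # no station filters: nothing gets ignored on distance
    for f in filters:
        miles = distances.get(f.station)
        if miles is not None and f.min_miles <= miles <= f.max_miles:
            return True  # one matching filter is enough
    return False
```

With the Chesham (0.0-0.5) and Amersham (0.2-1.0) filters from the example, a listing 2.0 miles from Chesham but 0.8 miles from Amersham matches, while a listing 0.1 miles from Amersham and nowhere near Chesham does not.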

Once the search is complete you get to see all the results on one page: all their images, important information and a map. No useless clicking through tiny thumbnails here. The key feature in the search results page is this: you can manually mark listings as either favorited or ignored, and any future searches you do from that particular search result page will preserve your favorites and ignored listings. So let's say you haven't searched for anything for a week or so: all you have to do is press the search button to perform the exact same search again and get the new listings. Gumbug will pre-filter the new listings according to your criteria and will automatically move the ones you've already ignored manually down to the bottom.
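A rough sketch of how the ordering could work when a search is repeated. The names are hypothetical, and letting a manual favorite override the automatic filters is an assumption, not something taken from Gumbug's code.

```python
from typing import Dict, List, Set


def order_results(listing_ids: List[str], auto_ignored: Set[str], manual_flags: Dict[str, str]) -> List[str]:
    """Return the ids with ignored listings moved to the bottom.

    manual_flags maps listing id -> "favorite" or "ignored", carried over from
    the search result page that the new search was started from.
    """
    def is_ignored(listing_id: str) -> bool:
        flag = manual_flags.get(listing_id)
        if flag == "favorite":
            return False  # assumption: a manual favorite beats the filters
        return flag == "ignored" or listing_id in auto_ignored

    # sorted() is stable and False sorts before True, so listings keep their
    # relative order within the "visible" and "ignored" groups.
    return sorted(listing_ids, key=is_ignored)
```

Because the sort is stable, the non-ignored listings stay in whatever order the source returned them, and the ignored ones simply trail behind.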

So, why am I showing the ignored listings at all, if I'm clearly not interested in them? The reason for this is that humans (especially real-estate agents) make mistakes. They will mislabel things, forget to mention a keyword that every other ad that you're interested in has, or they'll add something stupid like "not ground floor" which throws off the keyword filters.

A second reason to display ignored listings is because you might be sharing the link to the search results with more than one person, and the other person might want to un-ignore a listing. Gumbug isn't exactly built on security: any person that you share the search results url with can favorite and ignore listings. This is great for me because I want to share search results with my girlfriend so she can go through them as well, but when sharing in public it's better to spawn a new search with a new url.

Lastly, there's the map. One of the things I've consistently found myself doing when checking listings is cross-referencing the area with the deprivation map, which gives a rough indication of how much crime/poverty/incidents/bad things there are in an area. You can also click the name of each public transport station to display walking directions, so you know if that 0.6 miles is actually 0.6 miles (hint: it usually isn't).

Deprivation and Directions

Gumbug will continue to be a work-in-progress, but it's reached a point where I'm quite able to use it to make my own life easier. Maybe it can help someone else too. Here are some of its issues:

  • When you flag something as ignored and then go to the next page, the ignored listing will pop up again because it's been moved to the back of the sort order.
  • No street view support yet
  • Some map issues when viewing on mobile
  • No floor plans yet

Feel free to give it a try on Heroku. If for some reason your search doesn't seem to be working then that might be because the worker process is not running. Since Heroku's not cheap I'm running the worker process on my local machine. Heroku's database is very tiny so it might fill up very quickly. If there's enough demand I could consider setting up a more proper version of it, so consider this an attempt to gauge the public interest. Let me know what you think :)

Posted in Tech, UK

Controlling foobar2000 from Ubuntu with global hotkeys

Uh, long title, short explanation: I work primarily on an Ubuntu laptop but I listen to music from my Windows machine right next to it using foobar2000, still the best mp3 player available (come at me bro!). On Windows I'm used to using foobar's global hotkey functionality to quickly pause and switch tracks, but on Ubuntu any way you try to pause or skip a track requires a context switch, which is damn annoying if you're in the programming zone. Here's how I solved it.

  • Get the foobar http control plugin: https://code.google.com/p/foo-httpcontrol/
  • Configure it to require a password just to be safe. Without it anyone can log in to your music player and mess up your playlist and what you're listening to. With the password on they can still do exactly that but they'll have to sniff the network packets, which really isn't worth it just to control a music player.
  • Once configured, use the python script below to remote-control your foobar from the shell.
  • Put a shell script in /usr/bin (or /usr/local/bin, I forget) that calls the python script with the appropriate parameter (PlayOrPause, StartNext, StartPrevious). For more commands you can check the javascript of the browser interface of the http control plugin.
  • In Ubuntu's hotkey configuration settings, add your hotkey and make it call the shell script you just created.

Here's the script:

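import sys
import requests
from requests.auth import HTTPBasicAuth

# The first argument is the command to send: PlayOrPause, StartNext or StartPrevious.
# Replace the address, port and credentials with your own http control settings.
requests.get("http://your-ip-address-goes-here:1234/default",
             params={"cmd": sys.argv[1]},
             auth=HTTPBasicAuth('username', 'password'),
             timeout=5)  # don't let a hotkey hang if the Windows machine is off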

Voila! Cross-platform music hotkeys :D

Posted in Tech

Constructing a mind palace... in Minecraft

I absolutely love Minecraft. Though my level of obsession has dimmed a bit compared to when I was first mindblown, it's still an amazingly satisfying sandbox to play in. There always seems to be something new to build, which always manages to recapture my interest.

One of the things I noticed while playing Minecraft is that I pretty much know exactly what, where and how I built the things in my world. If I somehow lost my world and all of its backups, I am positive that I could recreate an extremely large portion, if not all of it, just from memory. The connection to a mind palace should now become evident.

In the past I've tried to build mind palaces of things, and have been more or less successful, up until the point where I try to populate the rooms in my mind with actually useful information. That's where my memory stops functioning well, I suspect because an entirely imaginary mind palace is just too unreal for me to hold in my mind. But if you tied a mind palace to something tangible (well, more or less) like a Minecraft world, a place with actual houses and paths and rooms, then perhaps it would be a lot easier to store knowledge in. If you go so far as to place things that you want to remember in signs and books, I bet you could remember a lot.

Another good example of a mind palace is my photo folder on my hard drive. I've organized it chronologically and hierarchically, first by year and then by month+day. While I can't remember exactly what happened on which day, using this folder structure as a mental guideline, I could tell you with reasonably high confidence what I was doing in any given month. But only for those months that I have photos of. My hobby of photography has waned a lot over the past years...

tl;dr: create a physical or virtual structure to hold your mind palace, then populate it with real-world information.

Posted in Tech, Thoughts

As days go by

I haven't blogged in a while. Despite having switched from enjoying-life-mode back into grind-and-earn-money mode, I've managed to maintain a remarkable sense of self-actualization over the past few weeks. I think that's partly because I try to work shorter days, as I mentioned in the previous post. I get time to recover and clear my mind at the end of the day, rather than never fully clearing it and piling up new workloads the next day without having fully processed the previous day.

Working fewer hours is part of the reason, but also a consequence of something else. My goals in life have become startlingly clear to me after I found out exactly how much money I need to buy a house in this bloody country. It'll take years and years of savings to fully pay off a nice house. Even if I found a better paying job, the difference it would make will never be as significant as I want it to be. And even with a better paying job you're bound by obligations and forced to work for the better part of the year. Given that fact, I'd say I've got a pretty damn good job right now, and I see no reason to change it for something marginally better.

Financial independence is the final goal. It's not even worth thinking about what I'll do after I achieve it, because the possibilities will be endless. In the past I tried several times to 'do a startup', sometimes alone, sometimes with friends. But what I've come to realize is that the startup life is not something that I want for myself. I'm usually quite introverted, and although I learned that I can muster up the extroversion needed to function capably in a startup role, it's not something I enjoy doing or would feel comfortable with doing for a long period of time.

This is the point where people tell me "but to gain something you will have to step out of your comfort zone". Well, yes and no. Stepping too far out of your comfort zone is simply not sustainable and will wear you down. For me, I think I function at my best while 95% within my comfort zone, using the remaining 5% to explore new territories. I need to find things out for myself. Advice from others only helps at the most superficial level; any concrete advice will be noted only for reference while I make my own mistakes, from within that very comfortable 95% plan.

Realizing that I am more reluctant to leave my comfort zone than I previously thought, I began to list my options. The list is limited, of course, compared to before, but the remaining options are those that I feel much more enthusiastic about than anything else. And because the options are 95% within my comfort zone, I get to expand my knowledge while actually enjoying it rather than feeling stressed out.

I don't believe that any advance in knowledge in the field of programming is going to help me to make progress as a human being. While it's true that I'm getting better at coding, especially within a project atmosphere, most of the things I've learned that I value highly are the result of interactions with people. Focusing deeply on a topic will teach you two things: in-depth knowledge of the topic, and how to focus deeply. I think I've learned enough on how to focus deeply on something to apply it to things other than programming. Don't get me wrong, I still love to code. But I find that a lot of my peers see coding as the final goal, whereas whatever the thing is that they're coding is just a happy side effect. I want to use programming as a means to an end, whatever end that could be, even if it has nothing to do with coding or dev-ops or anything technical. I believe that if I can use programming in this way, I can become better as a person.

Posted in Daily Life, Tech, Thoughts

The law of diminishing returns

There's an ideal amount of time you can spend at work, working. In fact there's more than one ideal amount of time. In my case, I find that if I work for 6 hours and then go home, I still have enough mental energy left to work on personal projects after the commute. Working 8 hours is also good, although productivity does decrease a lot in the later hours. But it's better than working 7 hours, because in that case I end up mentally tired and without enough time or energy left to do stuff at home.

Posted in Daily Life, Tech

Finding an apartment around London

Ever since I started my 'break' period after my cycling trip I've spent 1 to 2 hours almost every day on Rightmove and Gumtree, looking for apartments. Rightmove has a much better offer and more decent-looking apartments, but Gumtree shows private offerings of landlords who want to circumvent agencies, which results in zero fees and a much more casual way of dealing with things. Depending on the landlord this can either be a good thing or a bad thing. In my experience in London, I've been bitten once by a bad estate agent, and been incredibly surprised at the niceness of my current landlord whom I found via Gumtree. Businesses or people, either one can screw you over if you pick a bad one, I guess.

I'm quite systematic about my search. I've got about 4-5 areas that I'd like to live in, and a tight maximum budget. This means my search always ends up pointing me towards apartments that have something wrong with them. Either they're ridiculously tiny, right next to a railway line, too far away from a railway line or in a shitty neighborhood. The Deprivation Map Explorer is an absolute must-have. I keep a spreadsheet of travel times and costs for the stations around which I center my search. Another good criterion that mustn't be left out is how long the walk to the nearest station is.

Recently, since I've been seeing a lot of apartments online, I've gotten a good grasp of what a decent apartment 'should' cost in each area, so I've noted that down in my spreadsheet as well. Whenever an apartment pops up that's way cheaper than the average, I usually know to check for what's wrong. But occasionally, just very occasionally, a jewel pops up: an apartment that's 100 pounds per month or more underpriced. I've seen two of them so far, and in both cases I was too late.

I've been thinking about writing a script that spiders Gumtree every x hours, in certain categories, looking for certain keywords. In fact, I'm very tempted to do this already, but I suspect it'll take a lot of fine-tuning to get usable results out of it. The things that can be wrong with an apartment can't always be easily spotted by a script. Hooking it up to the Deprivation Map Explorer would be a must, and it would have to make heavy use of the Google Maps API to find the nearest station(s) and walking distance to each station. Most of the important criteria can already be filtered in the URL (area, max price, only ads with pictures, no agencies) so the core bit would be the scraping. Notifications can be sent out by email either immediately after the scrape, or consolidated every x hours or days.

I might give this a go if I'm still unsuccessful in finding an apartment next week. After all, there's always that golden rule of scripting: if you have to do it once, do it manually. If you have to do it twice, do it manually. But the third time, write a script.

Posted in Daily Life, Tech, UK

Reinventing the wheel for fun but no profit

Everyone in programming knows about not reinventing the wheel. Everyone I've ever worked with professionally knows to check if there are decent libraries or applications available that do what you want before deciding to build it yourself. Some things are so commonplace that you'd never think of reinventing them yourself.

Blogs, databases, wikis. Tons of implementations exist, and you'd have to spend some serious amount of time on it in order to make something better than what's already out there. Sure, you can pick one particular trait and improve on that, which will give you a good reason to release your work out in the wild as a viable alternative, but outside of that scenario, there's just very little reason to build something that's already been invented.

But that's ok! Nobody ever said that you're not allowed to have fun, and if your idea of fun is reinventing the wheel, then you should absolutely go for it. I for one have been having tons of fun writing my own wiki lately. It's fun because it forces me to think about how common problems with wikis can be solved. I usually think of a solution randomly, and then while coding it I notice that my solution ends up being slightly different from existing wikis. And then I realize that that's for a reason, and my design wouldn't have worked. Of course at other times I end up implementing the 'right' solution on the first try, and I get to put on my smug face.

There's another advantage of homegrown never-to-be-released designs, although I'd never flaunt this bit of wisdom professionally. You get to integrate your 'wiki' in incredibly nasty ways with whatever else you're building. After all, if you're building for fun, why bother making it a generic package? Releasing yet another wiki would be a crime against humanity anyway, so don't even think about it. Instead, just keep building and have fun. While you're having fun, you might stumble upon that one brilliant idea that no other wiki has yet, and you can start a new project which implements that idea cleanly. But the mind needs a playground, a free area where it can do what it wants. Reinventing the wheel is a great way to learn.

(Fun, like chocolate, should be used in moderation.)

Posted in Tech

Deteriorating hard disks

An odd thing has been happening to my hard drives lately: they keep swapping IDs. My 2TB disk used to always be \Device\Harddisk1 but now it sometimes appears as \Device\Harddisk2 and another disk takes the place of Harddisk1. I always assumed that the number was determined by which port on the motherboard was being used, but it doesn't appear so.

My guess is that the disk has to be spun up before it can report itself to the BIOS. Assuming that spin-up time naturally increases in a disk's lifetime, both disks must now have a similar spin-up time, which makes it a race to see which disk can report itself to the BIOS first. Since nothing else changed in my setup, the only other theory I have is heat. My room has been very hot lately, and perhaps one brand of hard drive doesn't tolerate heat as easily as the other brand. Maybe. Perhaps.

Posted in Tech

Good customer service != good company

I used to get my mobile phone broadband from Giffgaff. They're a company with a nice fresh image and they offer unlimited data for 12GBP per month. They've always disallowed tethering, but until recently they didn't enforce it. I really need the tethering though, because my home internet is utter crap (thanks Virgin) due to a technical problem that my landlord has to fix, so I can't do anything about it myself.

So naturally, when Giffgaff started enforcing the no-tethering rule, I switched to Three. They offer a 15GBP/month contract that allows for unlimited tethering. Their website said (and still says) that coverage in my postcode is good, and when I first received my sim card it was fine. But not long thereafter I lost all signal inside the house. I notified Three about this and they promptly came back to me and let me know there was an issue with the cell phone mast, and then I had signal again. But not for long. It's been broken for three weeks now, and I've finally had a chance to contact Three about it.

The situation is pretty crap. Apparently there have been 'changes to the network' and they're not going to fix the signal in my house. They didn't say that outright of course, but that's how it is. They continue to claim that they have good signal in my area, but that's basically a lie. And instead of fixing it, they've offered to send me a box that I connect to my broadband connection which will give me good signal inside the house. Given the flakiness of my internet connection and the fact that it's actually my landlord's, not mine, I doubt that I can take this option. The other option they offered was a discount on the monthly rate, which I think is pretty damn good actually, at least until they've solved the network issue.

I've noticed this phenomenon a lot lately: companies mess up, but they're extremely quick to react to any mishap and offer compensation via polite and adequate service. The only problem is: it doesn't reduce the number of fuck-ups! I am no longer impressed by companies having good customer service; it's pretty much the standard these days. What impresses me is a company that doesn't fudge the facts about their signal coverage.

Posted in Daily Life, Tech