We Won an (Honorable Mention) Award
As I mentioned in the previous post, we participated in a big data challenge called Data for Refugees, which aims to use CDR (call detail record) data to make or improve policies that concern Syrian refugees in Turkey. It was organized by Türk Telekom, Boğaziçi University, and TÜBİTAK, in collaboration with MIT Media Lab, Data-Pop Alliance, and others. The challenge had participants from 19 countries, including highly reputable universities such as MIT and the University of Oxford. It also had organization/evaluation committee members from MIT, Data-Pop Alliance, etc. We won an honorable mention award, of which I am very proud. Every once in a while, I find myself in a very lucky position, and I think participating in this challenge was one of those privileges. I am grateful to Assoc. Prof. Tuğba Taşkaya Temizel for inviting me to be a part of this.
Our competition report was already published, but a longer and better version has just been published along with the work of a select few other participants. We also made our project website public, so you can look at some data through an interface. Here is a short news article from our university's press.
Writing My Thesis and Applying for PhD
I am currently working on my thesis. It had seemed like Türk Telekom was going to let me use the data that we used for the competition (this possibility was mentioned in the NDA), but as far as I know, they unfortunately did not let anyone use it. I tried to stick to my initial topic, but it became dull without the CDR data, so I found another topic. I am currently analyzing an online food ordering system in terms of restaurants, locations, cuisines, menus, etc. I collect my own datasets and I think I have found a gap in the literature that I can fill (or at least prompt others to fill). Although my datasets do not look as exciting as those giant CDR datasets, I am actually happier with this topic since online food ordering is a big part of my daily life and I want my work to be relatable, especially for myself. Besides, this topic is more related to online human behavior, which is something I find fascinating.
For my PhD, I have picked some schools in which I am interested. I tried to stay away from schools that require the GRE (especially the ones that state a minimum score although ETS itself advises against doing so), but I might take the GRE and see what I can do. I think the GRE is overrated and it's lazy to use it to eliminate candidates. These standardized procedures cannot account for applicants' diverse skills, and the push for a more diverse student profile is on the rise even in computer science. My application to my current program was probably not that strong, and we had a relatively long interview in which they really tried to understand my motivations and such. Yet here I am as a successful student, research assistant, and teaching assistant of some core courses. I wonder if people from the schools that I apply to will visit this blog.
I dropped my old "engineer's mind, designer's eye, artist's soul" tagline. Nowadays I do not meddle with art-focused stuff, and I think my professional title leaves a better (albeit incomplete) image.
A class recommendation tool for World of Warcraft Classic
World of Warcraft came to such a point that people prefer to play its old (15-year-old) version, and Blizzard decided to act upon that. They replicated that version under the name "World of Warcraft Classic" for people who prefer it old school. Choosing a class was always one of the most discussed topics of World of Warcraft, especially among new players. Since many people did not play the old version or do not remember much, I figured many people would ask this old question again. The class decision is somewhat complex because it depends on many different factors. So, I made a multivariate recommendation tool that considers a priority list along with some filters to suggest classes and specializations. It actually started as an experiment, but I decided to make it an actual web tool after seeing that it was doable. Users can decide what is important for them, and the tool makes suggestions based on those factors. Moreover, I made it possible for visitors to give their own opinions by ranking classes/specs so that their data can be aggregated and used to refine the logic. I shared it on websites like Reddit, and it mostly got very positive responses. There were some displeased people, but I noticed they got unwanted results because they did not set their settings properly; perhaps I should add some additional explanations. Considering Reddit's generally toxic atmosphere, I think it went better than expected. Here is the tool. All in all, I spent 1-2 weeks on it.
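A minimal Python sketch of how a priority list and filters could be combined into a ranking. This is not the tool's actual logic: the rank-based weighting, the `recommend` function, and the option names, tags, and scores in the usage note are all hypothetical placeholders.

```python
def recommend(options, priorities, required=None):
    """Rank options by a weighted sum of their factor scores.

    options:    {name: {"scores": {factor: number}, "tags": {key: value}}}
    priorities: factors ordered from most to least important (assumed input)
    required:   tag values an option must match to be considered (the filters)
    """
    required = required or {}
    # Assumed weighting: the first priority gets the largest weight, the last gets 1.
    weights = {factor: len(priorities) - i for i, factor in enumerate(priorities)}
    results = []
    for name, info in options.items():
        # Apply the filters: skip options that miss any required tag value.
        if any(info["tags"].get(key) != value for key, value in required.items()):
            continue
        score = sum(w * info["scores"].get(factor, 0) for factor, w in weights.items())
        results.append((name, score))
    # Highest score first.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

For example, with made-up options such as `{"spec_a": {"scores": {"solo": 5, "group": 2}, "tags": {"role": "damage"}}}`, calling `recommend(options, ["solo", "group"], {"role": "damage"})` returns the matching options sorted by score. Aggregated visitor rankings could replace the hand-written scores.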
My (rogue?) location-based social media
As I said before, I created a social media application. For simplicity, let me just say it is for members of a Reddit-like website. This website is not exactly open for new memberships, and the members are anonymous. I noticed that members were trying to become friends (or more) with each other, while anonymity is important for most of them since many people post highly controversial stuff. Using this application, they can find other members similar to themselves (they can filter people by location, intention, age, gender, and personality similarity), fill in their profile, and socialize without exposing their identities. They have to confirm their membership by pasting their custom security code inside one of their posts. The system checks the code and lets them register. The problem is that people do not trust this system. Honestly, I thought it would not be a problem since they could just register to see if anything bad happens before disclosing much information, but I guess I was wrong. Moreover, I knew that the members are mostly male, but I did not expect my userbase to be this dominated by males. Considering that a lot of private sexting-like stuff happens on the website, I find it weird that I could not attract female members. I will try to find a solution, but I have already learned so much by making the application.
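The confirmation step can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the in-memory `pending` dictionary, and the `verify-` code format are hypothetical, and fetching the post from the website is left out.

```python
import secrets


def issue_code(pending, username):
    """Create a random one-time code for a user and remember it."""
    code = "verify-" + secrets.token_hex(8)  # hypothetical code format
    pending[username] = code
    return code


def confirm_membership(pending, username, post_author, post_text):
    """Return True if the user's own post contains the code issued to them."""
    code = pending.get(username)
    # The post must be written by the account that is being claimed.
    if code is None or post_author != username:
        return False
    if code not in post_text:
        return False
    del pending[username]  # a code can be used only once
    return True
```

Because the code only proves control of the website account, the application never needs the member's password or real identity.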
Scrapped or Postponed Projects
Throughout the year, I played with some ideas and early prototypes, but I abandoned them due to various reasons.
An R package to help data visualization
At some point, I was constantly assigning custom colors to stuff to make them distinguishable in plots. It is normally pretty easy to colorize groups, but that colorization completely changes once their order or number changes. I wanted to have a deterministic color assignment method that only takes the name of the label, so that "C" is always represented with a specific color (such as #f06464) for both c("A", "B", "C", "D") and c("C", "A", "D"). I developed its bare bones. It takes the label name, applies a hash, takes some specified bits, and maps them to an RGB spectrum while making sure the generated colors are vibrant yet different enough. I was making progress and I even implemented some automatic adaptation based on the given background color and/or vision impairment (such as different types of color blindness). Still, it was not great with many labels (an obvious problem) and I realized it needed a considerable amount of effort to progress further (hail Pareto). Besides, I mentioned this project to my thesis advisor and she did not seem impressed (although it seemed like a legitimate problem to me), so I dropped this project. Perhaps later...
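A minimal Python sketch of the hashing idea, not the original R code. It assumes SHA-256 as the hash, uses the first 16 bits as a hue, and fixes saturation and lightness to keep colors vibrant; the function names and default values are hypothetical, and it does not try to keep colors "different enough" from each other.

```python
import colorsys
import hashlib


def label_color(label, saturation=0.65, lightness=0.55):
    """Map a label name to a hex color that depends only on the name."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    # Take the first two bytes (16 bits) of the hash and scale them to a hue in [0, 1).
    hue = int.from_bytes(digest[:2], "big") / 65536
    # Fixed lightness and saturation (assumed values) keep every color vivid.
    r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


def assign_colors(labels):
    """Build a label-to-color mapping; order and count of labels do not matter."""
    return {label: label_color(label) for label in labels}
```

With this, `assign_colors(["A", "B", "C", "D"])["C"]` and `assign_colors(["C", "A", "D"])["C"]` are the same color. Two labels can still land on nearby hues, which is the "many labels" problem mentioned above.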
A news summarizer
I hate clickbait and news websites that desperately try to keep you clicking stuff. I made a prototype website that summarizes news. I wrote a news scraper in Python that I was planning to run periodically and automatically. That scraper would scrape, summarize, and upload the news to the database. The website, written in PHP, would then display the summaries. First, I found a news dataset that has news articles grouped by category. For each scraped news article, using TF-IDF and only the articles from the relevant category of that dataset, I was able to pick the most important sentence(s) from the scraped article. Then, the scraper also picked the best keywords to search for a relevant image from sources that have CC0-licensed images. I wrote this mostly in an ad hoc fashion, but I think it was not bad for the amount of effort I put in. There were some interesting quirks. For example, a news article about Donald Trump had an image of Trump Tower since there was no CC0 photograph of Donald Trump, and Trump Tower was seemingly the most relevant thing the algorithm could find.
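A minimal sketch of the sentence-picking step in plain Python, not the original scraper code. It assumes a smoothed IDF computed from the same-category reference articles and scores each sentence by the average TF-IDF of its words; the function names, the tokenizer, and the sentence splitter are simplifications introduced for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def inverse_document_frequencies(corpus):
    """Compute a smoothed IDF per word from the reference articles."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    idf = {w: math.log((1 + n_docs) / (1 + df)) + 1 for w, df in doc_freq.items()}
    default_idf = math.log(1 + n_docs) + 1  # weight for words unseen in the corpus
    return idf, default_idf


def top_sentences(article, corpus, k=1):
    """Return the k sentences of the article with the highest average TF-IDF."""
    idf, default_idf = inverse_document_frequencies(corpus)
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", article) if s.strip()]
    tf = Counter(tokenize(article))
    total = sum(tf.values()) or 1

    def score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        return sum(tf[w] / total * idf.get(w, default_idf) for w in words) / len(words)

    return sorted(sentences, key=score, reverse=True)[:k]
```

Words that are common across the category (and therefore uninformative) get a low weight, so sentences built from article-specific words rise to the top. The same scores could be reused per word to pick image search keywords.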
An .io game
I am interested in .io games, and I had a retro-style multiplayer game idea in my mind. The networking side and the logic behind it were somewhat clear in my mind and I was going to use Socket.IO, but writing the whole game, implementing latency compensation, creating graphics, creating SFX/music, and dealing with rendering (I was planning to use PixiJS) would take many months. Besides, having stable player traffic is especially hard in the beginning, which would definitely make players wait for other people to play. That means I also needed some kind of AI (more like complex rulesets) so that people could play with bots, and making bots player-like requires pathfinding algorithms and similar things that take a lot of effort. I think projects of this kind are great for showcasing my skillset, but I cannot spend that much time on a project right now.
Every once in a while I get inspired to paint. I have been working on a piece, but I am just not that good and I cannot achieve what I have in my mind. Online tutorials can only help so much, there isn't anyone successful here who teaches digital painting (another reason to go abroad), Skype sessions with foreign teachers are too expensive considering the exchange rates, and I am not skilled enough to learn on my own. At some point, you become able to learn on your own, but in the beginning, you just need someone to personally show you your errors and the right way. I made some progress, but it is not enough. Besides, I do not practice much. I do not stop and watch trees, clouds, grass, etc. How can I paint one without spending time on "seeing" it? By the way, I love how people always say "just watch some tutorials, there are many sources" and such. I am a pretty good learner, and things don't work like that. If they did, no one would go to university. After all, there are many resources on the Internet, right? Universities are not here only because they provide some credibility. Learning and education are complex subjects; you cannot just throw gigabytes of information (especially when most of the information is tailored towards beginners) at someone and expect them to actually learn it without a sound basis. For example, right now learning a new programming language and creating a project with it is fairly easy for me, because I have prior experience in the domain and I know how to "learn" something in this domain.
Building vs. Mining as a Career
I have been trying to decide what I want to become. I love building stuff, but I am not sure if I would like software engineering in a professional environment. The job seems too focused on certain technologies and frameworks. It makes practical sense that they want people who can already use the tools they use, but I think software engineering should be less about the tools and more about the mentality. Meanwhile, data science has fewer but established tools that everyone uses. Moreover, it seems like the process is more flexible and result-oriented. I feel like I could juggle Python and R, and people would not care that much, which makes the process flexible and not forced. For my thesis, I currently use Python, R, and Java together since they all have their own advantages/libraries and I have my own habits/preferences. After a long time (I had started seriously programming with Java), I started using Java again and I realized I really don't like Java. Yes, it is much faster, but I just love duck typing and dynamic typing. Everything feels intuitive with Python. By the way, I tried to move my pipeline to Spyder, but I think I will stay with R for now. Spyder's user interface feels underdeveloped (like IDLE). If I really needed to use Python for scientific computing, I would probably prefer to use PyCharm (with certain modifications to make it more suitable for scientific computing).
A few months ago, John Maeda admitted that "design is not that important." I really like him and I think his contribution to creative coding is critical. (Assistant Professor Andreas Treske had gifted me one of his books, Creative Code, and it was one of the best book gifts I have ever gotten.) However, as someone coming from the design side, I am surprised it took him so long to see it. Graphic design is a bit overrated nowadays. I am not saying it's not important, but form should follow function; this is one of the first principles taught in design schools (at least the ones that follow Bauhaus).
The tradition of feeding your thesis jury has always seemed weird and unethical to me. I do not think I should be responsible for feeding them. If someone should bring some refreshments, it is the school that should do so. More importantly, it looks like bribery. It actually makes me more uncomfortable when I try to empathize with the jury. Maybe I am being too strict, but if I were a jury member, I would strictly refuse to eat or drink any of those. It does not matter how small this favor is. A favor accepted by the authority can affect the authority's decisions (unless they think that it is not a favor but a responsibility, which is even more frightening), which is an unnecessary complication and risk. It is also not very clear where the line should be drawn. I have seen some students making their mothers prepare at least 10-15 different types of snacks/foods for the jury (and be present to serve them and other bystanders), enough to feed about 20 people. What if next time someone decides to take the jury out to Las Vegas and feed them there? This may be an exaggeration, but I cannot see much difference. This tradition contradicts the ideals and mentality of academia. Therefore, I congratulate the UCLA Psychology Department for banning this nonsense.
However, since I am not the jury and I just want to get it over with, I think I will just conform and bring some refreshments.