Wednesday, May 14, 2014

Final Challenge

This is it. If everything goes according to plan, I will have finished all of my undergraduate work before I go to sleep tonight. After this, I guess the "real" world. Though something tells me I've been in the "real" world all along.

I am pretty excited to move on, start new things, get some sleep. It has been a rough ride to get here. Only a few hours left to go.

Final Tarball

I just came from the CS computer lab where we put together our final tarball to send to the TA. It was a good time. James brought donut holes and we sat around and talked and laughed while we compiled all the files together.

I am happy with our team, and the way things worked out.

Final Presentation

So we placed second in the class. I have to honestly say that I am very surprised. It is not that I didn't think that our project was good, but I thought the other projects were amazing. Then again, those teams said similar things to me too.

It was a good feeling to see the work you've spent so much time on appreciated, and to have everything come together. I think we are all pretty happy with how it turned out.

Sunday, May 11, 2014

Attire

We are currently in the process of preparing for our final presentations on Tuesday, and a seemingly trivial question is leaving us stumped: "What to wear?" None of us really has any idea how to dress for these things. We know that how we present ourselves can often play a major role in how well our product is received, but we don't know what attire matches our goals. Our initial thought was of course "suits", but not everyone in the group owns a suit. On the other hand, the Google guys usually wear cargo shorts and a t-shirt, the natural look of a young hacker these days. Is that what business people want to see?

AWS Non-Free Tier

During the process of deploying our application on AWS, I ran into an interesting issue that we had yet to encounter during the entire development process: memory issues.

It turns out that the free tier of AWS gives you only 512 MB of memory. This is usually fine, unless you are using a memory-hungry language like Java. (Actually, any garbage-collected language will probably suffer performance issues with this little memory; there was a recent paper published on this. As it turns out, the C people are right about garbage collection with respect to performance; however, they are still wrong about a human being able to manage memory safely by hand.) The last time we deployed the application, we didn't have any issues, but since then there have been some system updates, our code has changed, etc. We must have been very close to the limit last time, because the crash actually takes place during code compilation, not at runtime.

Anyway, to remedy this I had to upgrade our AWS server to the non-free instance. Thankfully this is only $0.044 per hour, so it won't break the bank.
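For a rough sense of what that rate adds up to, here is a small back-of-the-envelope sketch. It assumes flat hourly billing at the quoted $0.044 and ignores storage, bandwidth, and any other charges; the function name is just for illustration.

```python
HOURLY_RATE_USD = 0.044  # the per-hour rate quoted above


def running_cost(hours, hourly_rate=HOURLY_RATE_USD):
    """Cost in USD of keeping one instance up for `hours` hours."""
    return hours * hourly_rate


# Assumption: the instance stays up around the clock.
print(f"one week: ${running_cost(24 * 7):.2f}")
print(f"30 days:  ${running_cost(24 * 30):.2f}")
```

Even left running for a full month, that comes to a little over thirty dollars under these assumptions.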

Final Deployment

So, this weekend we have been going through all the issues associated with actually deploying a production ready version of the application. This isn't the first time we have put the site up live, but this time we made extra sure that all the web related features we wanted worked correctly. Which, of course, they didn't.

By "web related features" I don't actually mean anything related to the code of the application, but rather the webserver, https, DNS, etc.

It is easy to see how so many servers end up misconfigured, given the wealth of options and settings that one must tweak in order to get the desired behavior.

Sunday, May 4, 2014

Presentation Participation

I am uncertain how I will be able to actively participate in the final presentation of our project. For the first two trial presentations, I didn't say anything except my name. This isn't because I was not involved or didn't know what was going on, quite the contrary. However, we only have ten minutes and are already just about using up that time, and other people strongly wanted to be speakers.

In general, I don't really care too much about speaking; I just want things to go well. At the same time, being the only silent observer (Sonny drives the demo, so he isn't "observing") makes me look like the tag-along member of the team.

But I keep coming back to the issue of time. I don't wish to "steal" parts from my teammates, and I certainly could add more to the content being presented, but where do we find the extra time?

The idea that we are currently floating is to have me perhaps do the conclusion. But with the way things have gone, that is still basically nothing. Something like, "And that's Mechanapp, revolutionizing expert system based auto repair!". I mean, it will actually probably be at least three sentences, but it won't be much.

Hmm... what to do...

The Final Merge

Currently our project has switched into a mostly testing mode. Almost all of the code that we are writing is just tests and bug fixes. We do have a minor issue that we hope to resolve tomorrow: two separate branches that need merging.

We diverged about a week ago to clean up the last two tasks in the application, and we have not had a chance to bring them back together yet. I don't anticipate any issues with the merge, but it is rather discomforting to not have it handled yet.

After this merge, I will sleep better. (Well, probably not, but one can hope...)

Pair Programming

This past week, Alan and I had a chance to implement the last of our major features together: the authentication system. It was the first time that we had done pair programming on this project and it was a really good experience.
I have not often had the chance, or taken advantage of the opportunity, to pair program in the past. I am somewhat skeptical of what appear to be "fad" or "brogrammer" programming techniques; however, this seemed like it had a lot of promise.
I hope that I will get the chance to do it again in the future.

Sunday, April 27, 2014

On Teamwork

Group projects are inherently difficult. Many different people working on the same thing, with different skill levels, backgrounds, schedules, etc. It is hard. Often there is this idea that everyone must do an equal share of the load. I think that idea is silly and completely unrealistic. I have been on many group projects, in academia and otherwise, and I have never experienced people equally sharing the load. However, this does not upset me; rather, I think it is the only way group projects can in reality function.

When you start a group project, some people are going to have a higher skill level in the various areas of interest. This is natural, and even good. The best thing to do is then to divide those people up, based on their skills, and have each person work where he can do the most effective job. However, you will inevitably end up with multiple people working on the same task, with different skill levels. Expecting the less skilled person to do as much as or more than the more skilled person is insanity. Why would anyone expect that outcome? If you take issue with this, you will find yourself always dissatisfied with group performance. Rather, the metric by which you should be evaluating your compatriots should be their ability to perform the tasks they say they can, in the time frame they give. Perhaps you could have done it twice as fast, but that is not an issue: as long as you are still working on other tasks, then more work is still being done. The less skilled person, by performing the task, becomes more skilled in the process, which helps the group as a whole.

To be more concrete, our project revolves heavily around web application architecture. This is an area that I am very familiar with, so I have more skill and understanding in the area than some of my teammates. As such, I have performed more work than some of my teammates. But I in no way feel that this means that they did not carry their share of the load. They did what they could, when they said they would, and I have been very satisfied. If our project had been focused on a different technology, say robotics, I would likely have been doing the least work rather than possibly the most, just because others would have had a better background in that area.

Libraries

Modern applications are so complex and do so many things that all but the largest of development efforts rely heavily on open source libraries to bootstrap themselves into a real working application. We are doing this in many different ways in our application. This ranges from large open source libraries, such as our framework, to smaller and perhaps more scary libraries like a Google+ auth plugin (not written by Google).

As the recent Heartbleed bug has shown us, there are dangers to using such tools, but what else can a developer do? It would be some form of madness to constantly reinvent the wheel. Furthermore, code you write yourself is no less likely to have bugs that are just as damning (often it is more likely).

How does one vet these libraries before including them in your application?

On "Real" World Features

Introduction

Our application is nearing its final completion stages for the semester. Throughout the course of the semester there has been pressure for us to add "real" world features. By "real" world features, I specifically mean a domain name, web accessibility (rather than running on localhost), Google+ and Facebook integrated login, etc. While I understand the importance of these features in a real world application, which this may become, I feel that having them be part of this iteration of the application is a very bad idea.
These features are all associated with a few common issues: cost, personal identification, and security (which in some ways are all sides of the same coin, a three faced coin...). Because of this I don't think they should have been required or requested for this application. We included them largely without objection for fear of being evaluated more negatively (from an academic perspective) if we did not have them. The mechanisms that drove the evaluation process are so opaque to us that we felt it better to just fall in line.
However, I do think that it is appropriate to address the issues with these features.

Domain Name

Problem

Registering for a domain name on the internet requires that one disclose a substantial amount of personal information to a third party. This is often not a big deal for a company, which already has all of its information listed publicly, but is not something that a student desires to do. Further it costs money.

Alternative that works just as well

At the end of the semester we will be doing our final pitches for our applications. At this point in a real world development cycle, many applications would still not be on the web but rather running locally. This would be intentional in order to keep the not yet launched application under wraps. After a backer was found, you might put it online. Certainly this is a debatable point; perhaps some applications would have the site up in development and even before it was backed by a venture capitalist, that's true. But we are students, and it is perfectly reasonable to have our application not sit on the real world wide web, or at least, it doesn't need a domain name until a real launch.

WWW Accessible

Problem

Making the application accessible on the world wide web, with real hosting, costs money, opens up the system we use to the very real perils of being a server on the world wide web, and if we use a hosting service again requires us to divulge personal information.

Alternative that works just as well

Run it locally. Just like for a domain name. Maybe an argument could be made for having it on the world wide web for the final pitch, maybe, but definitely not during development.

Google+ and Facebook Login

Problem

In our modern world, your social media/email accounts are very important, and a high-value target for hackers. If we implement the login improperly (which happens all the time on the biggest of web applications) then we expose real people to the real problems of getting their accounts hacked.

Alternative that works just as well

If our application must be web facing, then there is no alternative. Implementing our own login features is more likely to have security issues. But, as previously mentioned, the application does not need to be internet facing.

Summary

So this is my little rant on the features that I do not think are necessary or even good for our application. I am not too upset about them, but thought it might be something to consider for the future. I understand why they are desired, they bring a sense of reality to the projects. At the same time, while we pretend to be in the real world this semester, we should not forget that we are actually still students.

Sunday, April 20, 2014

How to manage the unmanageable

So this past week has been one of the busiest of my life. I have slept 15 hours in the past 4 days. I am not writing about this to garner a pity party, but rather to ask a question. What do you (whoever you are) do to stay productive when it seems like there is not enough time?

From the studies I have read, sleeping is always better than not sleeping, even when you are busy. But I don't know if the people who wrote these studies were computer science students also working 20+ hour a week jobs.

So when the rubber meets the road, I don't sleep and eat like crap. And I usually get everything done. What do you do when there isn't enough time?

The Increasing Cost of New Features

As our project has become larger and larger I have noticed a much greater cost for adding new features into the codebase. This is almost always the case for any non-trivial project, however this cost seems particularly painful right now, perhaps due to the busyness of my life in other areas as well.

The latest feature that I have been working on adding to our application has taken much more time than I had thought to implement.

This is also one of my largest flaws, estimating how long it will take me to complete a task.

New Features

In the past few weeks, I have had a much greater desire to add new features to our application. Now that things are mostly together and I can see the product for what it is, it seems like it is hard not to see it for what else it could be. This is not to say that I am unhappy with our project. I think we hit our goal and things came together better than I had expected.

At the same time, there are several things that I really want to add to our application. The rest of my team thinks they would be great, but doesn't feel that they have time to add them. I have to say I rather agree, but it still irks me nonetheless. How does one "finish" a project?

Sunday, April 13, 2014

Tedious

As we get closer to the end of the project, more and more of the features that we wish to implement require less and less interesting work and more tedious work. This is not to say that the features themselves are boring; they are certainly not, and they are quite cool to have in the application. However, web applications involve so many moving parts that in order to do them securely and correctly, you have to write a lot of boilerplate.

I find myself getting excited to work on a new feature, only to quickly lose interest as I realize all of the small indirectly related sub tasks that are much less interesting.

I wonder how one learns to make these small tasks more enjoyable, or if that is even possible?

Presentation Thoughts

This week my group had to do a mock presentation of our application to the rest of the class. It was a rather strange experience, in that there was not really enough time for the four of us to each speak in a fluid manner, leaving a single speaker and three generally silent members.

It felt strange to me, to be a silent figure at the front of the class. The role seems to imply some form of direct interaction upon my part with the presentation, although really there wasn't any. There was plenty of indirect interaction, through showing the body language of approval and interest in what my team member was talking about, but that was about it.

I was not dissatisfied with my team member's presentation; on the contrary, I thought that it was very well done. It only felt as though there should have been more time, or I should not have been standing there. This isn't really a critique against the professor, because I am well aware that this situation occurs all the time. I just wish there was a more organic way to deal with this type of presentation.

Small Changes

Currently I am working on adding some new features to Mechanapp to allow for user login and persistent user session. Both of these items are in and of themselves not very complex or difficult to implement. However, implementing them in a good, safe, reusable manner is another story entirely.

User login means that we are sending user credentials back and forth on the network. This means that we must now start encrypting the traffic with TLS, as well as purchase a valid TLS cert (if we want anyone to actually use this application). Further, both login and persistent user session (by which I mean tying state to a user id rather than a cookie) require that we introduce a second database that will be much more mutable than our previous one. Until now our architecture had remained pretty simple, but at this point it is getting more complex.
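To make "tying state to a user id rather than a cookie" a bit more concrete, here is a minimal sketch in Python. It is not our actual design: the class and method names are made up for illustration, the two in-memory dictionaries stand in for the second database, and credential checking is left out entirely.

```python
import secrets


class SessionStore:
    """Hypothetical store: tokens come and go, state lives with the user id."""

    def __init__(self):
        self._token_to_user = {}   # short-lived: one entry per active login
        self._state_by_user = {}   # long-lived: survives logout and new logins

    def login(self, user_id):
        """Issue a fresh random token for a user (password check omitted)."""
        token = secrets.token_urlsafe(32)
        self._token_to_user[token] = user_id
        self._state_by_user.setdefault(user_id, {})
        return token

    def state_for(self, token):
        """Return the user's state dict, or None if the token is unknown."""
        user_id = self._token_to_user.get(token)
        if user_id is None:
            return None
        return self._state_by_user[user_id]

    def logout(self, token):
        """Invalidate the token; the user's state is kept."""
        self._token_to_user.pop(token, None)


store = SessionStore()
first = store.login("alice")
store.state_for(first)["last_repair"] = "brakes"  # hypothetical state key
store.logout(first)
second = store.login("alice")
print(store.state_for(second))
```

The point of the split is that the token in the browser only identifies who is asking, so losing the cookie or logging in from another machine does not lose the user's state.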

It is strange how such small features can have such a large ripple effect in the application.

Sunday, April 6, 2014

Formal Attire

Tomorrow we have our first demo presentation of our software. There has always been an age old question about these types of events: what should one wear? I know this sounds silly, but hear me out. Often at the large CS conferences, the main speakers will wear t-shirts and jeans, the formal attire of the programmer. Is this the way to go? Or are these people just being lazy? I often think that people who wore a suit to such events would either have to be James Bond cool or they would not be taken seriously. In our final meeting we will be presenting to business people, which lends itself to a more classically formal attire, but what do you wear for other CS people in a formal engagement?

Meetings

Over this semester I have had a very interesting experience with meetings in my group. Generally our group has met at least once a week, often more than that, and observing these meetings has been strange. Often it seems that we get very little done during the course of the meeting. We discuss some high level topics and that is about it. It seems that most of the real discussion actually takes place in email chains and the like. This is not to say that the meetings are bad; on the contrary, they are perhaps some of the most enjoyable times during my week. It seems that our group has the fortunate (or unfortunate) issue of getting along too well. We get along so well during our meetings that it is hard to keep them from becoming just hang out sessions.

Reflecting upon this, I am not actually sure it is a bad thing. Certainly we have somewhat of a hard time accomplishing the tasks we wish to accomplish during the meeting, but they do get done shortly thereafter in emails and other forms of communications. Also I feel like the group bonding time is really good for our cohesiveness as a team.

Running out of Time

So at this point we are pretty late in the game. There are only about 6 weeks left in this semester, and that is really not a lot of time. This puts us in an awkward position with our project, and by us, I mean me. There are several large new features that I would like to implement before the semester is done, but I don't know if we will have time to get them in place. I really feel that they are killer features, but with so little time, I feel that I don't want to lobby for help from other team members until I am at least sure I can get them off the ground.

This puts me in a very sticky situation. Work on new features that may never see the light of day, or focus on polish of the existing codebase?

Saturday, March 29, 2014

IDEs and Emacs

So I am an emacs guy. I love emacs. I want to use emacs for everything I code. Unfortunately, emacs does not have great support for Java, which we are using for our project. In the past I have used Eclipse for Java, but this time through I thought I would try IntelliJ IDEA.

IntelliJ is a very good IDE, but it doesn't really understand the web framework we are using with our application, and so it tells us that our code is broken, even though it isn't. So for the first time ever, I decided to attempt to code a large Java project in emacs. It has been an interesting experience.

I find that I understand the application much better writing it in emacs. I have to manage all the imports myself and as such I have a much clearer understanding of how the dependencies work. On the other hand, I definitely code Java slower in emacs.

I am still not sure which way I prefer for coding Java. I mean emacs Java features are terrible, but emacs general features are awesome.

Perhaps I will make another post later in the semester after I have made a more formal evaluation of the two in comparison to each other.

Roles

As we are moving into a new phase of development in our application, I find that our development roles are changing. When we started out working on this application, there was a clear division of labor. We each had our own part of the codebase that we owned.

Now that things are farther along, and many core features are nearing a complete state, we are having to shift roles, working more intimately on each other's code.

This is an interesting transition for us and it is revealing a lot of issues we hadn't noticed before. For instance, until recently I only had a passing understanding of how our database setup worked. Before now, I didn't need to know how it worked; it wasn't my code, and I just used the API that my teammate made. Recently I had to work more directly with the DB code. I found myself spending quite a bit of time just working through the code before even getting to the issue, even though it had been in the application for a long time.

I think all of us are having this issue. It will be a little bit of a setback as we have to learn to work with code that we didn't write, but probably a good experience.

Polish

As of a few days ago our application is in a really nice state. Most of the features that we needed to implement are implemented. This really feels like a good place to be in. Now our development will probably shift to more candy features. Accounts, better GUI, more useful user feedback, etc.

I am finding it very hard, as we move into this set of feature development to stay really excited about these features. I find myself wishing to develop more large sweeping features, rather than these small ones. It is difficult to remind myself that it is these small features that will actually make the application successful. The devil is in the details, as the phrase goes.

It is a strange realization, that the use of a particular tool is not generally a function of its usefulness, but rather its polish. I mean, it still has to work, but it seems what really makes an application successful are these candy features.

Saturday, March 15, 2014

Full working stack

As of a few days ago my team has assembled a full working stack for our project. This felt like a very good milestone for us. Prior to this we had a lot of the pieces together, but there was always some fear that assembling them would cause the entire thing to fall apart.

While this feels like a great victory for us, it is also quite scary. I feel like we are almost finished with the product, and yet, there is still so much time left in the semester. More so, there are still a lot of important pieces we need to put into place. In particular, we need to start building our data set to drive our application. While the core program is in a mostly functional alpha state, it is useless without data to drive it.

I guess this is the "Sag" that was discussed in class. I know I am starting to feel what that word means, in more than one way. So much done, but still much more to do.

Branching issues

As our project has matured somewhat we have run into some interesting issues with our VCS. We are using git for our VCS and it is great! One of its killer features is the ability to create a lightweight branch very easily. This is, in my experience, much easier than the same process in SVN. This feature has proved to be very useful to us, as each of us can work on our own code without mucking up the others', and still share it to the master repo so that others can pull it when they wish. However, we have used it so much that we have begun to experience branch cruft.

We now have many branches, many of which are technically dead, but it can be hard to tell which branch has the features you wish to pull. You can of course, examine the commit log for each branch, to see which one has been active recently, but this process seems very tedious.

The solution for this problem was trivial, just add a prefix to each branch for each week, so that we know which are active. However we only learned this simple technique after a week of dealing with many branches of unknown activity.
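As a sketch of what such a weekly prefix could look like, here is a small Python helper. The exact naming format (ISO year and week number, then a slash and the topic) is only an assumption for illustration, as are the function names; any consistent scheme would do the same job.

```python
from datetime import date


def week_prefixed_branch(topic, day):
    """Build a branch name like '2014-w11/login' from a topic and a date."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}-w{iso_week:02d}/{topic}"


def is_from_week_of(branch, day):
    """True if the branch name carries the prefix for the week containing `day`."""
    return branch.startswith(week_prefixed_branch("", day))


today = date(2014, 3, 15)
name = week_prefixed_branch("login", today)
print(name)
print(is_from_week_of(name, today))
print(is_from_week_of("2014-w09/db-cleanup", today))
```

With names like these, a plain listing of branches sorts by week, so the stale ones are obvious without reading each commit log.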

Bug Tracking Etiquette

As our project has grown to a decent size, we have ended up making some minor mistakes and creating small bugs here and there. An issue which we are having to learn how to deal with is figuring out how to track all these bugs so that they don't get lost in the shuffle. The solution is simple of course: a bug tracker. Thankfully our repo management has a built-in bug tracker that we have started to use.

More interesting is the etiquette for bugs in code. Recently I discovered a trivial bug in a section of code that a teammate was responsible for creating. I thought that I understood his code well enough to make the change myself, but I hesitated on actually implementing the change. It occurred to me that changing another person's code, when that person is active on the project, could be considered somehow inappropriate. Instead of fixing the issue, I could just as easily have opened a ticket in our bug tracking system for him to fix it, but then I must wait for him to notice the ticket and take time to fix the problem. Neither of these alternatives is inherently better than the other, but it is an interesting dynamic that I had not encountered before.

I suppose the right answer on how to manage this stuff is very project and person dependent. It is surprising how much people's interactions with each other and feelings play into software development. It is much more than just writing code in emacs.

Saturday, March 8, 2014

On remembering to not be stupid

This post isn't a joke. I very frequently read scientific studies or articles or papers about the various things that people do that actually prevent them from being as happy and productive as they would like to be. Many of the things that I read about I actually do myself. Nevertheless, although I have amassed this great amount of knowledge about how to be productive and more happy, I put very little of it into effect.

For instance, I by nature have a higher than average propensity for anxiety. Caffeine is highly correlated with an increase in anxiety (among several other negative things), and yet I still drink coffee several times a day. Sleep is another big one. Getting the correct amount of sleep is highly correlated with increased performance everywhere all the time in everything, and yet I frequently get less than six hours of sleep. Why do I do this? Because even though I know this, it feels like it isn't true. It feels like coffee helps me, rather than hurts me. It feels like working through the night helps me, rather than hurts me. So I need to remember to not listen to my feelings and not be stupid. This is especially important during very difficult school semesters.

So with that note, off to bed.

WTFM

RTFM or "Read The Fucking Manual" (http://en.wikipedia.org/wiki/RTFM) is an amusing, if vulgar, phrase that arose from people not reading the documentation for how something worked, and then complaining when it did not work or when they did not understand it.

Last week I spent literally twelve hours poring over the code for a library that I was hoping to integrate with some code I was working on. The library was very poorly documented, with some of the vague documentation contradicting other parts of the documentation. This reminded me of an alternative phrase that has popped up in recent times, WTFM or "Write the Fucking Manual".

With programming becoming a more and more common practice and more and more emphasis being placed on writing code very quickly (a perversion of Agile techniques), documentation of code is going by the wayside. This is very bad. Why is it bad? Two reasons: danger and waste.

It is dangerous to not write documentation because of the other hot topic these days: software security. When one fails to properly document a codebase, the people who are working with the codebase (either in library form or as other developers working on a shared project) are much more liable to make errors that could introduce security bugs.

It is wasteful because, as I already mentioned, I spent twelve hours working through a library. The library was complex, but the actual API wasn't supposed to be complex, and I should have been able to adapt it to my code in a couple of hours. Unfortunately there was some subtle strange behaviour that took me a long time to work out because of the terrible documentation.

This is a waste. That time I spent could have been used better, and many many people suffer from lack of good documentation on a daily basis. It is a tragic affair. So WTFM.

Technology Choices

This last week we were supposed to discuss our technology choices in our client meetings. In our meeting, we were short on time and didn't get to discuss them in as great detail as I would have liked, so I thought I would talk about that a little bit here. I was in charge of researching web frameworks for our team, so that is what I will discuss.

We are creating a web based application, and for that we are using a web framework. I was considering for our team both Django and Play, the former based on Python, the latter on Java/Scala. I prototyped a few toys in each of them, and decided on using Play in the end.

As I mentioned in a previous blog post, they are both very similar. Their overlap in feature set is probably close to 90%. So it was rather hard to make the decision.

In the end we decided on Play, mostly because we could write Java for it, which we are all very good at. It was also somewhat easier to create the types of web applications we are targeting with Play, while other types of applications would be easier with Django.

This was a bittersweet choice for me. I really like Play. It is designed to be highly scalable and performant and (after reading some of the docs) is a joy to use. On the other hand, while being very familiar with Java, I really don't care too much for Java. This is mostly a personal preference issue, as you can certainly create very well designed nice applications in Java (and just about anything else). On the other other hand, Play supports Scala, which I am quickly becoming very interested in, as I have developed a fondness for functional programming paradigms recently.

I am not sure that I will write any Scala in this class, but I can definitely see myself coming back to use Play with Scala in the future.

Saturday, March 1, 2014

On Portability of Modern Code

Recently I have noticed that there are a lot of programming languages targeting the JVM. Aside from Java, there are Clojure, Groovy, Scala, JRuby, Jython, and JavaScript (via Rhino), as well as a handful of other less notable languages.

It is interesting how all of these languages flocked to the JVM to bootstrap themselves into all of the work that has been done on a powerful bytecode runtime. Without doing any extra work they get platform portability and runtime optimization (maybe, I am not totally sure about this dark magic.).

It is interesting to ponder what might happen if Oracle stopped developing the JVM, or closed the source of future updates to it. All of these separate entities would have to come together to maintain the runtime, since they are essentially dependent upon it.

Even more interesting is the fact that if a programming language isn't running on the JVM, it is pretty likely to be a scripting language. Ruby, Python, PHP, and JavaScript all require no separate compilation step to run. It seems that the ideals of creating portable code have now been met with either the JVM or a script.

On web frameworks

Over the past few days I have been working on selecting the web backend for our project. By the last client meeting I had narrowed it down to Django, a Python-based framework that is popular right now, and Play, a Java/Scala-based framework that is also becoming quite popular.

Evaluating which framework to use has been quite a challenge. Both seem to support all the features we care about. Both are expressive and "easy" to use. Both seem great.

As a team, we have a stronger background in Java, but we are also Computer Scientists and should be able to pick up a new (newish as we have some prior experience with Python) language in just a few days.

But web frameworks are much more than a library for a given language; they are a meta language unto themselves. The web has evolved in such a way that when we "surf the internet" we are using a number of technologies at once, many not designed to work with each other, connected together in strange and opaque ways. Web frameworks allow the programmer to once again be a programmer, rather than a Rube Goldberg machine master.

Further complicating the issue is the fact that there are so many good web frameworks. The only way to truly compare one framework to another is to sit down and write an app in both frameworks. Alas, this means learning two frameworks (or more), which is not a task that the busy student enjoys. But it must be done.

Are build systems bad?

The other day a friend and I were attempting to compile some Java code which we did not write ourselves. While we both are very experienced with Java, neither of us had written or built any Java code in quite some time. Further, we did not have Eclipse installed on the system we were working with, so we went about the daunting task of building the Java program from the command line.

It took us a good ten minutes at least to get the code to finally compile. The experience reminded me just how frustrating it is to work with the Java build system. At one point in time, I knew the system in and out. I learned all the nuances of it just long enough to write a bash script to do it all for me.

When I come across issues like this I often think about how much some tools separate the programmer from what is actually happening with his code. Before I wrote the aforementioned bash script, if you had asked me to compile one of my Java projects without Eclipse I would have had a very hard time, and I wasn't a novice Java programmer.

Java and Eclipse aren't the only examples of this. Many programming languages have complex build systems, and because we often use IDEs that build our software for us, we never actually understand how they work.

I can't decide if this is a good thing or a bad thing. Certainly automated building of software is good. But when I build software in C I write a makefile which runs commands that I understand (although there is some magic in the inference rules of POSIX make that I think most people gloss over). When I build Java in Eclipse I am totally separated from the process.

Does the modern programmer need to know or care about this stuff? Or should he just spend his time on more interesting problems?

Saturday, February 22, 2014

Testing Saves Time

Today, while working on an Arraylist implementation in C, I found myself working hard to find a very strange bug. As of now I still haven't found the cause, and will probably be working on it for some time more tonight.

I only found this bug because I decided to make heavy use of C assertions. This reminds me of just how important testing is to the process of developing good software. Had I omitted these assertions, the code would have seemed to work correctly, only to give me error-ridden results, which usually take much longer to debug.
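To make the idea concrete, here is a minimal sketch of the kind of assertions I mean. It is not the actual code from my Arraylist; the ArrayList struct, its field names, and the arraylist_check and arraylist_append functions are all hypothetical, chosen only to show invariants being checked on the way into and out of an operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical array list; the field names are illustrative only. */
typedef struct {
    int   *items;
    size_t size;      /* number of elements in use */
    size_t capacity;  /* number of elements allocated */
} ArrayList;

/* Check the invariants that every operation is expected to preserve. */
static void arraylist_check(const ArrayList *list)
{
    assert(list != NULL);
    assert(list->size <= list->capacity);
    assert(list->capacity == 0 || list->items != NULL);
}

/* Append a value, growing the storage when it is full.
 * Returns 0 on success and -1 if memory could not be allocated. */
static int arraylist_append(ArrayList *list, int value)
{
    arraylist_check(list);
    if (list->size == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 4;
        int *grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        list->items = grown;
        list->capacity = new_cap;
    }
    list->items[list->size++] = value;
    arraylist_check(list);
    return 0;
}
```

If some other function corrupted size or capacity, the next call would stop at the failing assertion instead of quietly producing wrong results later. Compiling with NDEBUG defined removes the checks.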

It is strange that there is so little desire to test first among programmers. Lack of testing invariably leads to long painful bouts of working through seemingly correct, but functionally wrong, code. Hopefully the ideas of software correctness and testing will take root more firmly in the future.

Iterators in C

Recently I have found myself implementing an Arraylist data structure in C. As I did this I contemplated how I might implement various standard OO mechanisms in C in support of this Arraylist. The most obvious mechanism to support for this data structure is the classic Iterator pattern. So I thought I would talk about how iterators work in general and how they might work in C.

Iterators supply an interface for traversing an Aggregate data structure, that is, something that encapsulates one or more elements, in a way that decouples the traversal from the client code. Most iterators provide a mechanism for at least four central operations: reset(), next(), isDone(), and getItem(). These functions allow for resetting the traversal to the beginning of the Collection, moving the iterator to the next element in the Collection, checking whether there are any more elements after the current element, and getting the element at the current location of iteration, respectively.

Iterators are very powerful as they provide a way to define different types of traversal over the same Collection, ways to store traversals for later, as well as many other neat tricks.

Usually when implementing an iterator in a standard OO language, you have the given Collection create some Iterator object that conforms to some abstract Iterator interface (or an actual interface rather than an abstract class, in languages like Java). In C we are not afforded such luxuries as objects.
What we could do is define a struct in a header file somewhere whose members are all function pointers. Any collection that wishes to provide an Iterator can create a struct of this type and set the pointers in it to point to static functions that are implemented internal to the Collection.
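A rough sketch of what that might look like follows. This is not my finished implementation; the Iterator struct layout, the IntList collection, and the intlist_iterator function are all names made up for illustration. I am also assuming here that isDone() becomes true once the traversal has moved past the last element.

```c
#include <stddef.h>

/* Hypothetical iterator interface: a struct of function pointers plus
 * the traversal state they operate on. */
typedef struct Iterator Iterator;
struct Iterator {
    void  (*reset)(Iterator *it);
    void  (*next)(Iterator *it);
    int   (*isDone)(const Iterator *it);
    void *(*getItem)(const Iterator *it);
    void  *collection;  /* the aggregate being traversed */
    size_t position;    /* current index into the aggregate */
};

/* Hypothetical fixed-capacity collection of ints. */
typedef struct {
    int    items[8];
    size_t size;
} IntList;

/* Static functions, internal to the collection's source file. */
static void list_reset(Iterator *it) { it->position = 0; }
static void list_next(Iterator *it)  { it->position++; }

static int list_is_done(const Iterator *it)
{
    const IntList *list = it->collection;
    return it->position >= list->size;
}

static void *list_get_item(const Iterator *it)
{
    IntList *list = it->collection;
    return &list->items[it->position];
}

/* The one public entry point: hand out an iterator wired to the
 * static functions above. */
Iterator intlist_iterator(IntList *list)
{
    Iterator it = { list_reset, list_next, list_is_done, list_get_item,
                    list, 0 };
    return it;
}
```

Client code would then loop with something like for (it.reset(&it); !it.isDone(&it); it.next(&it)) and never needs to know that an IntList sits underneath.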

There are quite a few issues with this method, however. For one, it arguably violates the intention of static functions by allowing references to them to leave the file in which they were defined. But this is kind of how C works, Wild West style.

I haven't finished all the implementation yet, so I can't speak to how practical this would be (I am sure that Google can tell me, but sometimes it is useful just to contemplate something.). It is an interesting task forcing OO concepts into non-OO languages.

Proposal Afterthoughts

Now that proposals have been picked and teams have been chosen, I have taken some time to reflect on the process as a whole. My proposal was not picked, which is okay. Having spent some time reflecting on Professor Ackley's feedback, it seems as though my idea may have been too much for this semester. I still believe that it is doable, inevitable even, but not in this context with this amount of time.

This is kind of how I felt about most of my ideas, that they wouldn't be feasible in eleven weeks. I had a few ideas that I liked a lot better than both of the ideas I ended up proposing, but felt that it was not realistic for us to complete them in such a short amount of time. In retrospect, I probably should have ignored this notion and proposed them anyway. I felt that if I wasn't sure I could bring an idea to what I considered a complete version in the given time frame, I should not propose it at all. But I failed to realize that even a partially functional prototype may be worthwhile, or maybe not. That is something that is a bit scary about this course: I don't know how much is enough for anything. I am not really complaining about that, in fact I think that it is structured that way intentionally, but it is still a bit nerve-racking.

On another note, I felt like I learned a lot about pitching a project through this process. While I was generally satisfied with my presentation style, I didn't get any traction from my peers on my project. I think that the problem was twofold: my proposal sounded like it would be rather difficult, and I failed to go through enough refinement on the project itself. Having had more time to think about my proposal, I think I would change a lot of things about how I described the implementation. I mean, this is how I am about everything though. As soon as I get done with a given section of code I am ready to declare it a terrible failure and start again (regardless of the fact that the code actually functions fine). Nevertheless, I think in this case my proposal could have been truly helped by some more thought and refinement. A lesson for next time I guess. As far as it being a difficult, or rather time consuming, task, I don't think there is much to be done about that.

I suppose I feel a little bit disappointed that my projects didn't get great responses from my peers, but I think I have learned from the experience and will be able to do better next time.

Saturday, February 15, 2014

Composite Design Pattern

The Composite design pattern is a wonderful tool for dealing with sets of objects that can encapsulate other objects. Generally speaking, if you can describe a set of object compositions in terms of a tree, it is probably a good candidate for the Composite design pattern.

The idea is actually quite simple. You can have two types of object, a Leaf or primitive object, and a Composite object. The Leaf object does not encapsulate any other objects, but the Composite does.

An interface is provided that treats all the objects, Leaf or Composite, as the same type. Using this general interface, clients pass commands to the objects without respect to whether the given concrete object is a Leaf or a Composite. If the concrete object is a Leaf, then the command requested is executed directly. If it is a Composite, then the command is propagated to the child objects. These children may be Leaf objects or they may be Composite objects themselves. In the latter case, the command is further propagated out through all of the Composite objects until it reaches all of the Leaf objects and thus traverses the entire tree of objects.
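Here is a minimal sketch of that propagation in C, under some assumptions of my own: the Component struct, the Kind tag, the fixed MAX_CHILDREN limit, and the component_total operation are all invented for illustration. The "command" in this case is simply summing a value over the tree.

```c
#include <stddef.h>

#define MAX_CHILDREN 4  /* arbitrary limit chosen for this sketch */

typedef enum { LEAF, COMPOSITE } Kind;

/* Hypothetical component: one type stands in for both Leaf and Composite. */
typedef struct Component Component;
struct Component {
    Kind       kind;
    int        value;                   /* meaningful for leaves only */
    Component *children[MAX_CHILDREN];  /* meaningful for composites only */
    size_t     child_count;
};

/* The single operation clients call, whatever the concrete kind is. */
int component_total(const Component *c)
{
    if (c->kind == LEAF)
        return c->value;                /* Leaf: execute directly */

    int total = 0;                      /* Composite: propagate to children */
    for (size_t i = 0; i < c->child_count; i++)
        total += component_total(c->children[i]);
    return total;
}
```

The client only ever calls component_total on whatever it is holding. Whether that is a single Leaf or the root of a deep tree is hidden behind the one function.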

This pattern allows you to create very general code where the client of the interface you provide is spared any ugly details about what type of object it is dealing with. The pattern is often used in graphical applications, but is very general and can be applied to a large variety of problem domains.
It is a great pattern and one I will strive to use more frequently in my code.

An Aside on Specs

For this class, I ended up writing two proposals. These proposals weren't really specs, but they were something in between a spec and a marketing document.

I have written a few specs in the past, and every single time I remember having to "update" the spec after the fact to deal with issues that we encountered during development. This seems somewhat counter to what a "spec" classically is: a description of the software system in exacting detail. The details are of course the problem, because as it turns out, in reality the only true description of a software system or protocol or what have you is the implementation of said system.

This isn't to say that specs are useless; they are incredibly useful. They are a guide that should be followed within reason, and updated when they describe something that is either impossible or highly impractical.
I'll finish this aside with an email that I saw on the Linux Kernel Mailing List a while ago and stumbled across again today.

From: Linus Torvalds
To: Arjan van de Ven
Subject: Re: I request inclusion of SAS Transport Layer and AIC-94xx into the kernel
Date: Thu, 29 Sep 2005 12:57:05 -0700 (PDT)
Cc: Willy Tarreau, SCSI Mailing List, Andrew Morton, Linux Kernel Mailing List, Luben Tuikov, Jeff Garzik

On Thu, 29 Sep 2005, Arjan van de Ven wrote:
> a spec describes how the hw works... how we do the sw piece is up to us ;)

How we do the SW is indeed up to us, but I want to step in on your first point.
Again.
A "spec" is close to useless. I have never seen a spec that was both big enough to be useful and accurate.
And I have seen lots of total crap work that was based on specs. It's the single worst way to write software, because it by definition means that the software was written to match theory, not reality.
So there's two MAJOR reasons to avoid specs:

- they're dangerously wrong. Reality is different, and anybody who thinks specs matter over reality should get out of kernel programming NOW. When reality and specs clash, the spec has zero meaning. Zilch. Nada. None.
It's like real science: if you have a theory that doesn't match experiments, it doesn't matter how much you like that theory. It's wrong. You can use it as an approximation, but you MUST keep in mind that it's an approximation.

- specs have an inevitably tendency to try to introduce abstractions levels and wording and documentation policies that make sense for a written spec. Trying to implement actual code off the spec leads to the code looking and working like CRAP.
The classic example of this is the OSI network model protocols. Classic spec-design, which had absolutely zero relevance for the real world. We still talk about the seven layers model, because it's a convenient model for discussion, but that has absolutely zero to do with any real-life software engineering. In other words, it's a way to talk about things, not to implement them.
And that's important. Specs are a basis for _talking_about_ things. But they are not a basis for implementing software.

So please don't bother talking about specs. Real standards grow up despite specs, not thanks to them.
  Linus

Wednesday, February 12, 2014

Touchable

Having become largely disillusioned with my initial project, I decided to propose an entirely different product. Much of the feedback I initially received dealt with general issues that are applicable to many projects, and I tried to address them here in this proposal. Admittedly, it is somewhat of a gamble to change course entirely this late in the game, but I feel very strongly about this new product, so I believe it is worth the risk.

Feedback is welcome!

http://cs.unm.edu/~zdevex/touchable.pdf

Saturday, February 8, 2014

Software Craftsmanship

I have often found myself thinking about the ideas of Software Craftsmanship, well before 460. The news outlets these days are overrun with stories of poor software causing tremendous amounts of damage, both financial and otherwise. As someone studying computer science, I know all too well that we in the field are to blame for this current climate of dangerous programs.

There have been many, many studies on proper coding practices, as well as proper project management. The strong consensus of these studies is that software written with care for quality (maybe not perfection) is not only more secure and stable, but also costs less money and time to create. It is better on all fronts; there is no downside. Of course one cannot go overboard in this process and spend weeks upon weeks writing ten lines of code, but simply writing code with reckless abandon is a known recipe for destruction.

Despite it being well known that there are practically no disadvantages to writing high quality code, people continue to hack things together in unmaintainable, unexpandable, insecure ways. I can't understand why we as humans do this. What is wrong with us? Are we so naive that we believe that we as programmers don't need to follow proper practices, because we think we are above these practices?

Friday, February 7, 2014

3 words

Primary: Mature, Responsible, Honest

Secondary: Creative, Adaptive, Hardworking

Tuesday, February 4, 2014

Proposal Review : Carpooling 505

Here is a link to the proposal I reviewed, Carpooling 505.

http://cs.unm.edu/~zdevex/proposalReview0.pdf

Monday, February 3, 2014

Proposal

http://cs.unm.edu/~zdevex/proposal.pdf

Saturday, February 1, 2014

Agile Reactions

Recently in my software engineering class we discussed the ideas of Agile Software Development. The discussion seemed to illuminate several things to me, primarily that there is this good idea called Software Requirements, there is this good thing called Agile Software Development, and there is this bad thing which is the common programmer's perception of each.

Often when I encounter people who advocate for the Agile style, they function as if Agile means the following.

Don't comment your code (because your code is self documenting).
Don't write specs, because that isn't flexible.
Start coding right now (Like if you are still reading this sentence and not coding you aren't doing agile right).
Don't write tests (Because hey I have madz hacking skillz, my code doesn't have bugs).

When I encounter people who advocate for the more classical approach (which is not too often these days) they seem to think the following.

Plan everything before writing any code.
Understand everything before doing anything.
Never change from the spec/standard.

From our discussion, it really seems like real Agile and Requirements-based development are two sides of the same coin, and not either of the two caricatures I described above. Requirements does seem to focus more on figuring out what you are going to do before you start coding, while Agile does seem to emphasize flexibility more, but they both seem to be reactions to the same thing: poor programming. You need to know what you are going to build before you build it; you need to write tests, plan, and define the problem space. At the same time you need to be flexible with your client and their needs, as much as is possible. You need to not spend all your time planning, but you do need to spend some of it planning. At the same time you do need to actually create a product at some point.

In reality it seems that Agile and Requirements are both abstractions on good design and engineering practice, which isn't to detract from them at all. After all, this is Computer Science; abstraction is our most powerful tool.

Tuesday, January 28, 2014

Project Timeline

This blog article serves as the initial reference timeline for my project. Initially I was going to attempt to post something visual, like a PNG or PDF of a graphical timeline, but I have been unable to find a way to do so that I feel is suitable to be read on a blog. I am exploring ProjectLibre to become the general tool for managing our project, but it didn't seem feasible to upload anything from that program into this blog (Although maybe it is and I just didn't see it!).

Timeline

The timeline is focused around central Milestones. This being the case, before I go into the dates on the calendar I want to outline what those Milestones will be (Subject to change of course.)
For the time being, I am only listing must-have milestones. If we are ahead of schedule, there will be several additional milestones we target as well. These I will define at a later date.

Milestones

Milestone 1
- After discussing with team, decide on concrete technologies (web server type, programming language, target platforms) for project.
Milestone 2
- Have initial documentation for Social Network Protocol complete.
- Have initial documentation for Social Network Aggregator Application Complete.
- Decide on better names for both of the above.
Milestone 3
- Finish prototype implementation of Social Network Protocol.
- Provide test suites for prototype implementation.
- Provide deliverable quality Social Network Protocol documentation.
- Provide working draft quality Social Network Aggregator Application documentation.
Milestone 4
- Complete core libraries for reading from Social Network Protocol in the Social Network Aggregator Application.
- Complete at least one handler for at least one social network.
- Provide test suites for core libraries and social network handlers
- Provide complete Social Network Aggregator Application documentation.
Milestone 5
- Complete core GUI for prototype implementation.
- Provide test suites for GUI codebase.
- Provide alpha 0.1 release of application.
Milestone 6
- Provide alpha 1.0 release of the application.
- Provide additional test suites for all codebases.
- Have website up to market the application/protocol.

Dates

Note: events listed as Completed in a given week must be completed by the end of said week. Events listed without a Completed identifier should be worked on during that week. Events that are not completed in a given week still require visible progress to be shown to the client during the meetings. This is to say that all weeks must be moving towards delivery (There are no weeks off!)

Week 1
- Milestone 1 - Completed
- Meeting with client. - Completed
  - Inform the client of the specific technologies that we will be using and motivate their use. That is to say, we can't say we are writing this program in Lisp, just because Lisp macros are off the hook.
Week 2
- Milestone 2 - Completed
- Meeting with client. - Completed
  - Present documentation for both the application and the protocol.
  - Get feedback on functionality that might be added/removed/made optional etc.
Week 3
- Meeting with client. - Completed
  - Show client updates to documentation.
  - Demo any prototypes that we currently have (Optional this week)
- Milestone 3
Week 4
- Meeting with client. - Completed
  - Show client updates to documentation.
  - Demo prototype to client (Does not have to be complete, but must show something.)
- Milestone 3
Week 5
- Meeting with client. - Completed
  - Show deliverable quality documentation for Social Network Protocol.
  - Show updates to Social Network Aggregator documentation.
  - Demo final prototype Social Network Protocol implementation.
  - Show that demo passes test cases.
- Milestone 3 - Completed
Week 6
- Meeting with client. - Completed
  - Show updates to Social Network Aggregator Documentation.
  - Inform client of the first social network we will target for handler.
- Milestone 4
Week 7
- Meeting with client. - Completed
  - Show updates to Social Network Aggregator Documentation (Should be pretty much done at this point.)
  - Show visible progress on core library implementation. Passing test cases, pulling data, etc.
  - Show visible progress on social network handler. Passing test cases, pulling data, etc.
- Milestone 4
Week 8
- Meeting with client. - Completed
  - Demo core libraries and social network handlers.
  - Provide complete documentation set.
  - Show code passes our test cases.
  - Show GUI design ideas to client and get feedback.
- Milestone 4 - Completed
Week 9
- Meeting with client. - Completed
  - Discuss marketing with client. (Website, etc.)
  - Show GUI prototypes to client.
- Milestone 5
Week 10
- Meeting with client. - Completed
  - Show more GUI progress to client.
- Milestone 5
Week 11
- Meeting with client. - Completed
  - Show final 0.1 alpha version to client.
  - Discuss final marketing information.
- Milestone 5 - Completed
Week 12
- Meeting with client. - Completed
  - Show final 1.0 alpha version to client.
- Milestone 6 - Completed

Terms

Social network handler
A piece of code that exposes a given social network through the Social Network Protocol.

Version numbers x.y
Changes in x reflect major releases. Changes in y reflect bug fix releases. Version numbering starts at 0.1

Sunday, January 26, 2014

Concept Paragraph

    In the past ten years social media has become one of the most dominant tools for interaction and self expression. Attempting to capitalize on this trend, many different forms of social media have emerged. Some have replaced older incarnations that are now past their prime, as Facebook did MySpace. Others coexist, serving similar but distinct purposes, such as Twitter and Instagram. Keeping track of all of these different information sources can be a daunting task, especially if one wishes to continue to have a life outside of social media in the real world.

   Several programs have attempted to aid users in aggregating and viewing their various social media feeds. These include such tools as Flipboard, HootSuite, and HTC BlinkFeed, among others. While these tools are generally well done, they still miss the mark. This is evidenced by the fact that people still most often use the Facebook and Twitter apps rather than the Flipboard app. The general problem with these apps is that they strive too hard to be "cool" rather than be useful. For instance, the Flipboard app displays the various feeds in an artistic tile layout. This type of layout can be good for some types of user interfaces, but is not what is generally desired for reading social media.

   What is needed is a simple, clean, friendly, application which aggregates data from arbitrary social media type sources. Something that will feel familiar and easy to use, no matter what set of social media applications a user wishes to integrate. This application needs to be above all useful and intuitive in data presentation, and secondarily it needs to offer seamless ability to interface with the various features of given social networks, such as image upload or adding a friend to your contacts.

   To accomplish this goal, I propose two discrete steps. The first is the establishment of a general purpose protocol to describe social media interactions. This will be similar in concept to classical RSS, but more flexible and rich so as to allow more diverse types of information to be retrieved and more complicated interactions between client and server, such as uploading an image. The second, once this protocol is established, is to implement a front end that will create a one-stop shop for social media interaction.

   Further, this protocol could be extended to implement new meta-features operating inside existing social media applications. These meta-features will provide additional functionality through the existing social networks. For example, crawling social media for the purposes of web archiving is currently a non-trivial task. Social media environments often rely heavily on dynamic technologies, such as JavaScript and AJAX, which are difficult to handle while crawling. This protocol could provide an easily crawlable interface for social media, allowing trivial web archiving of social media. Another example might be the implementation of a crypto layer inside the social network. Our application could post GPG-encrypted messages to an arbitrary social network, which could then be read by select users of our application who also have access to the lower level social network.

   Users of this application will no longer need to have N applications in order to be connected to their friends; they will only need our application. Further, we will not try to awe them with a fancy interface alone (although our interface should be attractive) but rather provide them with powerful meta-features that they will not be able to use inside the lower level social network itself.

   The protocol we develop will be open sourced and reference implementations given to the community. The application we create will be offered for free, with either a donation option or a paid version with additional features. Our motivation for this method of support comes from the notion that people are unwilling to purchase a new application like this without trying it first to see how useful it is.

   It is my belief that both the protocol and application built on it will be very useful to a variety of consumers. Wouldn't you like to have a free and feature rich way to interact with your social networks that also aids the open source community?

Saturday, January 25, 2014

Practical Style

In regard to my last post about the importance of a consistent code style in a VCS environment, I wanted to post a concrete example.

Consider the following code,


#include <stdio.h>
#include <stdlib.h>

int
printMessage(char* message);

int
getName(char* message);

void
checkExit(int code);

int
getName(char* message)
{
  char* name = malloc(sizeof(char)*10);
  int i = 0;
  int err = 0;
  char c = 0;
  while((c = getchar()) != '\n')
    {
      name[i] = c;
      ++i;
    }
  err = printf("%s %s\n", message, name);
  free(name);
  return err;
}

int
printMessage(char* message)
{
  return printf("%s\n", message);
}

void
checkExit(int code)
{
  if(code < 0)
    exit(code);
  else
    return;
}

int
main(int argc, char* argv[])
{
  int err = printMessage("What is your name?");
  checkExit(err);
  err = getName("Hello,");
  checkExit(err);
  return 0;
}

This is a simple little C program that will read in your name and print it back at you. It does have one issue (at least one; I wrote it rather quickly). In the getName function there is no check to make sure you aren't writing past the end of the buffer (In C you must worry about such primitive things!). Now, let's say another programmer with a different sense of style comes in to fix this later, but he also changes the code to match his personal style preferences: no camel-case, function return types on the same line as the function name, different feelings about braces, etc. Below are his changes.


#include <stdio.h>
#include <stdlib.h>

int printmessage(char* message);

int getname(char* message);

void checkexit(int code);

int getname(char* message) {
  char* name = malloc(sizeof(char)*10);
  int i = 0;
  int err = 0;
  char c = 0;
  for(; i < 10 && (c = getchar()) != '\n'; i++)
    {
      name[i] = c;
    }
  err = printf("%s %s\n", message, name);
  free(name);
  return err;
}

int printmessage(char* message) {
  return printf("%s\n", message);
}

void checkexit(int code) {
  if(code < 0)
    exit(code);
  else
    return;
}

int main(int argc, char* argv[]) {
  int err = printmessage("What is your name?");
  checkexit(err);
  err = getname("Hello,");
  checkexit(err);
  return 0;
}

Now the two versions are semantically identical, except for the added buffer length check in getname (formerly "getName"), but look at the diff that is produced from them.

4c4,5
< int printmessage(char* message);
---
> int
> printMessage(char* message);
6c7,8
< int getname(char* message);
---
> int
> getName(char* message);
8c10,11
< void checkexit(int code);
---
> void
> checkExit(int code);
10c13,15
< int getname(char* message) {
---
> int
> getName(char* message)
> {
15c20
<   for(; i < 10 && (c = getchar()) != '\n'; i++)
---
>   while((c = getchar()) != '\n')
17a23
>       ++i;
24c30,32
< int printmessage(char* message) {
---
> int
> printMessage(char* message)
> {
28c36,38
< void checkexit(int code) {
---
> void
> checkExit(int code)
> {
35,39c45,51
< int main(int argc, char* argv[]) {
<   int err = printmessage("What is your name?");
<   checkexit(err);
<   err = getname("Hello,");
<   checkexit(err);
---
> int
> main(int argc, char* argv[])
> {
>   int err = printMessage("What is your name?");
>   checkExit(err);
>   err = getName("Hello,");
>   checkExit(err);

Whoa! That is a large diff for such a small semantic change! Take a look at the diff that would have been produced if the style had not changed.

20c20
<   for(; i < 10 && (c = getchar()) != '\n'; i++)
---
>   while((c = getchar()) != '\n')
22a23
>       ++i;

A diff like this is much clearer.

Hopefully this illustrates why choosing and sticking to a particular code style is important in a VCS environment.
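
As an aside on the fix itself: the length check in getname keeps the loop inside the buffer, but nothing ever writes a terminating '\0', and printf's %s needs one. Below is a minimal sketch of a more defensive read, written in the second programmer's style. The name read_name and its parameters are my own invention for illustration; they are not part of the program above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of the program above): reads one line from
   `in` into `buf`, storing at most size - 1 characters followed by a
   terminating '\0'. Returns the number of characters stored. */
size_t read_name(FILE* in, char* buf, size_t size) {
  size_t i = 0;
  int c = 0; /* int rather than char, so EOF is distinguishable from input */

  assert(in != NULL && buf != NULL && size > 0);

  /* Stop one slot early so there is always room for the terminator. */
  while(i + 1 < size && (c = fgetc(in)) != EOF && c != '\n')
    {
      buf[i] = (char)c;
      ++i;
    }
  buf[i] = '\0';
  return i;
}
```

Calling read_name(stdin, name, 10) would stand in for the loop in getname. Note that when the input is longer than the buffer, the rest of the line is left unread on the stream; whether to discard it or report an error is a design choice this sketch does not make.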

Standard Style in a VCS environment

Recently I have been thinking a lot about the importance of standards in programming. I have long felt that people should make their programs behave according to a standard, and that they should make extensive use of contractual programming through interface-type mechanisms (which I also view as coding according to a standard). I have not often thought about how the code's style should conform to a particular standard.
A little clarification is in order. By "style" I do not mean C90 vs C99 or Java 6 vs Java 7, but something more like the following.


if(a == b){
    a++;
}
return true;


if(a == b)
     a++;
return true;

In most languages the two sets of statements above are semantically identical, so choosing to write one over the other is generally thought of as a matter of preference. Now, I am well aware that there are people out there who lobby, perhaps with good reason, that one style may be superior to another. The hypothetical merits of either method are not what I wish to consider here, but rather the consistency of either one's use. What I really want to address is the consistent use of a particular style in a team environment, and more specifically when using version control.
If one programmer on a team creates a file according to their own personal, and equally valid, style preferences, it will inevitably be edited by a different programmer with different preferences at some point in the future. Let's suppose that this future programmer, while adding some feature, also changes the file to match his or her preferred style and then commits the changes upstream. This is bad.
Why is this bad? Because the diff for the file will include not only the second programmer's feature-related changes, but also all of the semantically meaningless changes to the style of the code. This makes it very difficult to discern at a glance what actually happened between one commit and another.
This may and hopefully does seem obvious to anyone reading this article, but it is not something I had considered until recently.
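
To make "semantically meaningless" a little more concrete, here is a rough sketch of a check that treats two snippets as equal when they differ only in whitespace and letter case. The function names are hypothetical, and the approach is a crude heuristic I am assuming for illustration: it also lowercases string literals, it would conflate identifiers that differ only by case, and it would not recognize the braced and unbraced if statements above as equivalent. A real tool would have to parse the code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: advances past any whitespace at the start of `s`. */
static const char* skip_space(const char* s) {
  while(*s != '\0' && isspace((unsigned char)*s))
    s++;
  return s;
}

/* Hypothetical helper: returns 1 when `a` and `b` are identical after
   ignoring all whitespace and letter case, and 0 otherwise. */
int same_ignoring_style(const char* a, const char* b) {
  assert(a != NULL && b != NULL);
  for(;;)
    {
      a = skip_space(a);
      b = skip_space(b);
      /* If either snippet is exhausted, they match only if both are. */
      if(*a == '\0' || *b == '\0')
        return *a == *b;
      if(tolower((unsigned char)*a) != tolower((unsigned char)*b))
        return 0;
      a++;
      b++;
    }
}
```

Under this check, a function header written with the return type on its own line and a camel-case name compares equal to the same header written on one line in lowercase, while replacing a for loop with a while loop does not. That is roughly the distinction a reviewer has to make by eye when style changes and real changes land in the same commit.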

Thursday, January 23, 2014

Reactions to Wikipedia article on Software Engineering

The Wikipedia article on software engineering seems to speak to me of one thing in general. Although people are being taught "Software Engineering", being hired as "Software Engineers", and writing books on the subject, people are still not quite sure what "Software Engineering" is. Particularly, people seem to be unclear on how to separate it from Computer Science. At first glance, it seems to be some formalization or practical application of Computer Science, but if this is the case, does this mean that Computer Science is not in and of itself practically applicable? Perhaps...

And in other ways it seemed as though the article described Software Engineering as a commercialization of purely academic Computer Science.

Much of the article focused on aspects of programming that relate to reusable, correct code, such as testing, design, following standards, etc. Maybe the practices taught in Software Engineering could be summed up as teaching Computer Scientists how to not be lazy programmers? It does seem that this concept may be at the heart of the field. After all, a truly overwhelming amount of data shows how negative an effect lazy and sloppy programming, testing, and documentation can have on a project, and these are areas that even the most brilliant Computer Scientists often struggle with.