Saturday, December 27, 2014

My thoughts on Rust in 2015

Disclaimer: these are my personal thoughts on where I would like to see Rust go in 2015 and what I would like to work on. They are not official policy from the Rust team or anything like that.

The obvious big thing coming up in 2015 is the 1.0 release. All my energy will be pretty much devoted to that in the first quarter. From the compiler's point of view, we are in pretty good shape for the release. I'll be mainly working on associated types (with Niko) - trying to make them as feature complete and as bug free as possible. This is primarily motivated by the standard library; we want to provide everything the standard library needs here. Other things on my work list are a number of syntactic changes (`?Sized`, slicing (this is not purely syntactic), other minor changes as they come up) and coercions (here, there are a few backwards incompatible things, and providing a really solid, ergonomic story around coercions is important). Any other time is likely to be spent on 'polish' things such as improving error messages and refactoring the compiler in minor ways; possibly even some hacking on the libraries if that would be most useful.

We (the Rust community) also need to do some planning for post-1.0 Rust. There are obviously a lot of features people would like post-1.0 and it would be chaos to try to implement them all asap. So, we need to decide what is in scope for the immediate future of Rust. My personal preference is to focus on stability for a while and only add a minimum of new language features. Stability for me means building better, broader libraries, supporting the ecosystem (work on Cargo and, for example), fixing bugs, and making the compiler friendlier to work with - both for regular users (error messages, build times, etc.) and for tooling.

'A minimum of language features' will probably mean quite a few new things actually, since we postponed a lot of work due to 1.0 which are pretty high priority. UFCS, custom DST coercions, cross-borrowing coercions, and being able to return a bare trait from a function are high on my wish list. There are a couple of large features which seem debatable for immediate work - higher kinded types and efficient inheritance. Ideally, we would put both of these off, but the former is in great demand for improving the standard libraries and the latter would really help work on Servo (and also improve the compiler in many ways, depending on the chosen solution). Both of these need a lot of design work as well as being big implementation jobs.

Finally, the biggest piece of Rust making me uncomfortable right now is macros. We will have limited macro support for 1.0. I would like to start looking at macro rules 2.0 and a better system for syntax extensions as soon as possible. The longer we leave this, the more people will use and get used to the existing solution or start using external tools to replace syntax extensions.

My vision for macro rules, is basically a tweaked version of today's. Fundamentally, I think hygiene needs to be built into everything from the ground up, including more sophisticated 'type' annotations on macro arguments. Unfortunately, I think this means a lot of work, including rewriting the compiler's name resolution machinery (but then we want to do that anyway).

Syntax extensions/procedural macros have many more open questions. I'm not sure how to make these hygienic and not too painful to write. There is also the question of the interface we make available. Using the compiler's AST and giving extensions access to the entire compiler is pretty awful. There are a number of alternatives: my preference is to separate libsyntax's AST from the compiler's and make the former available as an interface, along with a library of convenience functions. Other alternatives are using source text or token trees as input and output, or relying much more heavily on quasi-quoting.

Looking further ahead, I don't have too many big language features I'm wanting to push forward (efficient inheritance and nested enums/refinement types, being the exception; perhaps parameterised modules at some point). There are a bunch of smaller ones I'm keen on though (type ascription, explicit lifetimes in the syntax, parameterising with ints to make fixed length arrays more usable (which is related to CTFE), etc.), but they are mostly low priority.

I am keen, however, on pushing forward on making the compiler a better piece of software and on improving support for tooling. I think these are two sides of the same coin really. Some ideas I have, in rough order of importance/priority/ease of implementation:

  • fix parallel codegen. This is currently broken with what I suspect is a minor bug. There is another bug preventing having it turned on by default. Given that this halves build times of the compiler for me, I am keen to get it fixed.
  • Land incremental codegen. Stuart Pernsteiner came super-close to finishing this. It should be relatively easy to finish it and land it. It should then have a massive effect on compilation performance.
  • Improve the output of save-analysis to be more useful and general purpose. I hope this can become an API of sorts for many compiler-based tools, especially in the medium term, before we are ready to expose a better and more permenant API.
  • Separate the libsyntax and compiler ASTs. This is primarily in support of the macro changes described above, but I think it is important for allowing long term evolution of the compiler too.
  • Refactor name resolution. Again important for macro reform, but also it is a confusing and buggy part of the compiler.
  • Make the CFG the first class data structure for all stages from borrow checking onwards.
  • Refactor trans to use its own IR, i.e., make it a series of lowering steps: CFG -> IR -> LLVM. This should hopefully make trans easier to understand and extend, and allow for adding our own optimisation passes.
  • Refactor metadata. This is a really ugly part of the compiler. We could make much better use of metadata for tools (there is a lot of overlap with save-analysis output and debuginfo, and it is used by RustDoc), but it is really hard to use at the moment. It is also crucial to have good metadata support for incremental compilation. I hope metadata can become just a straightforward serialisation of some compiler IR, this may mean we need another IR between the compiler's AST and the CFG.
  • Incremental compilation. This is a huge job of work, but really necessary for a lot of tools and would go along way to solving our compile time problems. It is very dependent on a better compiler architecture.

I guess that not all of this will get done in 2015, but I can dream...

Side projects

Some things I want to work on in my spare time (and possibly a little bit of Mozilla time, if it is prioritised right): get DXR up and going. Frustratingly, this was working back in June or July, but then DXR underwent a massive refactoring and I had to port across the Rust plugin. That has dragged on because I have been low on both time and motivation. But it is such a useful tool that I am keen to get this working, and it is getting close now. Once it is 'done' there are always ways to make it better (for example, my latest addition is showing everything imported by a glob import).

Syntax extensions on methods. I hacked this up on the plane to Portland last month, but fixing the details has taken a long while. I think it is ready now, but I need to implement something that uses it to make sure that the design is right. This project meant changing a few things with syntax extensions, and I hope to deprecate and remove some of the older, less flexible versions. Plus, there are some other follow up things.

The motivation for syntax extensions on methods is libhoare. I want to be able to put pre- and postconditions on methods. I have this working in some cases, but there is a lot of duplicate code and I think I can do better. Perhaps not though. Perhaps it can also inform some of the design around libraries for syntax extensions.

A while back I thought of adding `scanln`, a compliment to `println` using the same infrastructure, but for easy input rather easy output. I ended up forgetting about this because the RFC for adding it was rejected due to no out-of-tree implementation, but my implementation required big changes to the existing `format!` support. I would like to resurrect this project, since lack of really easy input is the number one thing which puts me off using Rust for a lot of small tasks. I believe it also raises the bar for Rust being taught at universities, etc.

I have some crazy ideas around a tool that would be a hybrid of grep/sed-with-knowledge-of-syntax-and-types, refactoring tools, and a Rustfix/Rustfmt kind of thing. It seems I need this kind of thing a lot - I spend a lot of time on tasks which are marginally more complicated than sed can handle, but much easier than needing a full refactoring tool.

Tuesday, December 23, 2014

I was getting frustrated trying to map people's irc nicks to their GitHub usernames (and back again). I assume other people were having the same problem too. It's pretty hard to envisage a good technical solution to this. The best I could come up with was having a community phone book for the Rust community. I had been meaning to experiment a bit with some modern web dev technologies, so I thought this would be a good opportunity.

Some of the technologies I was interested in were the modern, client-side, JS frameworks (Ember, Angular, React, etc.), node.js, and RESTful APIs. I ended up using Ember, node.js, and the GitHub API. I had fun learning about these technologies and learnt a lot, although I don't think I did more than scratch the surface, especially with Ember, which is HUGE.

What made the project a little bit more interesting is that I have absolutely no interest in getting involved with user credentials - there is simply too much that can go wrong, security-wise, and no one wants to remember another username and password. To deal with this, I observed that pretty much everyone in the Rust community already has a GitHub account, so why not let GitHub do the hard work with security and logins, etc. It is possible to use GitHub authentication on your own website, but I thought it would be fun to use pull requests to maintain user data in the phone book, rather than having to develop a UI for adding and editing user data.

The design of follows from the idea of making it pull request based: there is a repository, hosted on GitHub, which contains a JSON file for each user. Users can add or update their information by sending a pull request. When a PR is submitted, a GitHub hook sends a request to the backend (the node.js bit). The backend does a little sanity checking (most importantly that the user has only updated their own data), then merges the PR, then updates the backing database with the user's new data (the db could be considered a cache for the user data repository, it can be completely rebuilt from the repo when necessary).

The backend exposes a web service to access the db. This provides two functions as an http API (I would say it is RESTful, but I'm not 100% sure that it is) - search for a user with some string, and get a specific user by GitHub username. These just pull data out of the database and return it as JSON (not quite the same JSON as users submit, the data has been processed a little bit, for example, parsing the 'notes' field as markdown).

The frontend is the actual webpage, which is just a small Ember app, and is a pretty simple UI wrapper around the backend web service. There is a front page with some info and a search box, and you can use direct links to users, e.g.,

All the implementation is pretty straightforward, which I think verifies the design to some extent. The hardest part was learning the new technologies. While using the site is certainly different from a regular setup where you would modify your own details on the site, it seems to be pretty successful. I've had no complaints, and we have a fair number of rustaceans in the db. Importantly, it has needed very little manual intervention - users presumably understand the procedures, and automation is working well.

Please take a look! And if you know any of those technologies, have a look at the source code and let me know where I could have done better (also, patches and issues are very welcome). And of course, if you are part of the Rust community, please add yourself!

Monday, December 15, 2014

Notes on training for sport

I like training for sports. I have rock climbed a lot, and done a bit of swimming and kick boxing. I wouldn't say I'm very good at any of those, but I have definitely improved a lot. I think some of what I learned might be interesting, so here it is. None of this is very scientific, especially since the number of participants in the study is one. There are also lots of better sources - articles by coaches and sports scientists, rather than amateurs like me. Still, if you are interested, read on.

* Don't get injured. Injury prevention should be your number one goal when training. That means knowing your limits, warming up, and doing 'pre-hab' exercises (these are exercises that don't directly get you closer to your goals, but reduce the risk of injury by training the antagonistic muscles or improving mobility, etc). The weeks you miss due to injury will affect progress more than any other factor in your training.

* Training must by super-specialised. Training works because the body adapts to the pressures put on it by training. But that adaptation is much more specialised than you might think, especially once you get more advanced in your training. This can lead to some surprises. For example, most climbs are most demanding on the fingers (but see section on technique, below), that means doing pullups will not help you achieve these kinds of climbs. Likewise, be very precise about where you are training on the power-endurance spectrum, being able to run for many kms will not help you run 100m any faster.

This also goes for training the antagonists. For example, doing pressups or bench presses makes climbers more imbalanced, not less. This is because it is usually the shoulders which cause more trouble than the elbows (those exercises are good for balancing the elbows). Rebalancing the shoulders requires exercises that bring the shoulder blades back and down, such as rowing-type exercises. Balancing the elbow is also interesting - climbing tends to overuse the brachioradialis (used for pull ups with the palms out or curls with the palms down) vs the biceps, so doing curls (palms up) can improve muscle balance in the elbow, even though the biceps are usually thought of as a climbing muscle.

Also, yoga is terrible for climbers, balance-wise.

* Technique is important. You always think your technique is good enough, but it can usually be better. This is obvious for technique-based sports like climbing and kick boxing, but it holds true for basically all exercise, even lifting weights - my biggest gains in the bench press and deadlift came from technique coaching, and that was starting from 'good' technique.

* Stretch when warm. Stretching is not a warm up and stretching cold is really bad, I got injured this way a few times. This is important for yoga in particular, you really need to go gentle until you are warmed up.

* To train effectively you need to repeat the same thing and make it progressively more difficult. Doing one session of an exercise doesn't make any difference, you have to do that session once a week (or more) for six weeks (or more) and keep increasing resistance or going for longer. However, if you keep doing the same thing for too long, you'll hit a plateau and won't improve. I found this hardest with the things I have most fun doing - I don't want to change them up because I love doing it, but if you don't make changes, you don't keep improving.

* Protein is the king of nutrients, at least as far as training is concerned. I had this vividly illustrated when I turned vegetarian, my performance dropped and I lost muscle mass. I solved that problem with protein shakes and I think getting plenty of protein is the best thing you can do for you diet when training hard.

There is a meme that too much protein is bad for you in some way, but I don't think that is true, at least as long as you have an otherwise balanced diet (plenty of fibre, vitamins, etc.). Research linking excess protein to kidney damage only indicates that if you already have kidney damage, then excess protein can make it worse. No research (afaik) indicates that excess protein can cause kidney damage in the first place.

I haven't found any other supplements to be anywhere near as worth while. BCAAs seem to have a significant but small effect. Some carbs in drinking water when training also seems to help a little. Creatine made a big difference, but the weight gain (in water retention) meant it was not worth it for climbing (except maybe to escape a plateau), and when you stop taking it the withdrawal is harsh, training-wise.