[utah-devel] Improvement suggestions

Tue Feb 12 16:19:51 UTC 2013

Hello all,

In this e-mail I'd like to summarize some of the things that aren't 
working well, along with some suggestions to improve the situation. Some 
of the things below have already been discussed outside this mailing 
list, so I'm not claiming to be the author of every suggestion. I'm just 
trying to bring them together so that we can discuss and prioritize 
things that are important but usually get delayed because they aren't 
urgent. In particular, this is the criterion I've tried to follow when 
ordering the issues.

- Coding guidelines
As a team of developers working on the same project, we need some 
guidelines to provide consistency to our code. There's something already 
written here:
https://utah-dev.readthedocs.org/en/latest/development.html#coding-guidelines-for-developers

but I believe it isn't enough. Over the past few months, we've made some 
good improvements with regard to PEP8 and started to take care of PEP257 
as well, but the code is still far from homogeneous. Some examples 
of what I mean:
- Variable names (myvariablename vs. my_variable_name)
- String formatting ('%s' % my_string vs. '{}'.format(my_string)).
- String quotes ('my string' vs. "my string").
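
To make the three points above concrete, here is a small sketch of the 
same message written in both styles. The variable names and the message 
text are made up for illustration and are not taken from the UTAH code; 
which style we pick matters less than picking one.

```python
# Illustrative only: these names are hypothetical, not from the UTAH code base.
machine_name = 'utah-vm-01'

# Style A: snake_case name, single quotes, str.format()
message_format = 'Provisioning {0}'.format(machine_name)

# Style B: the same text with double quotes and the % operator
message_percent = "Provisioning %s" % machine_name

# Both styles produce exactly the same string, so the choice is purely
# about consistency across the code base.
same_result = message_format == message_percent
```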

- Unit test cases
I've always found it contradictory that testing tools aren't tested as 
much as the software that is delivered to the customer. If we want to be 
able to redesign, refactor, add new features, etc., we need to be sure 
we're not breaking anything, and right now we don't really have a strong 
guarantee of that. Also note that, since Python is a dynamically typed 
language, unit tests are supposed to detect (almost) all the problems 
that a type system would detect at compilation time, but this is just 
not happening now.
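
As a minimal sketch of the kind of problem I mean, consider the helper 
below. `describe_timeout` is a hypothetical function invented for this 
example, not something in UTAH. A careless version that concatenated a 
string with an integer would only fail when that line actually runs, and 
a one-line unit test is enough to make it run.

```python
import unittest


def describe_timeout(seconds):
    """Hypothetical helper: render a timeout in seconds for a log message."""
    # Writing 'timeout: ' + seconds + 's' here would raise TypeError for an
    # integer argument, but only at run time, when this line is executed.
    return 'timeout: {0}s'.format(seconds)


class DescribeTimeoutTest(unittest.TestCase):
    def test_accepts_integer(self):
        # Exercises the code path with the type that callers really pass.
        self.assertEqual(describe_timeout(30), 'timeout: 30s')
```

With the standard library runner, a test like this could be run with 
`python -m unittest <module name>`.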

- Documentation
Documentation is still sparse, and some aspects of how UTAH works make 
it look like a black box. We need more documentation for users and 
developers, and more control over what gets merged, to avoid accepting 
changes that modify behavior without an update to the related 
documentation. In addition to this, we need up-to-date design 
documentation that provides a high-level view of how UTAH is supposed to 
work before getting into low-level details.

- Command line
The command line binaries that we offer to the user are difficult to use 
and learn. Ideally a user should only care about `run_utah_tests.py` for 
the server and `utah` for the client, but there are other binaries and 
it's not clear why they are there.

We should work towards having something that is friendlier and easier to 
remember. In my opinion, VCSs are a good example of how a piece of 
software with lots of commands can be exposed to the user in a uniform 
way (I'm not very familiar with cobbler, but it feels similar as well). 
My point is that `utah <subcommand>` should be the only thing the user 
has to remember. I understand that it is difficult to abstract all the 
complexity of the actions that are supported, but we should put more 
effort into this.
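
A rough sketch of what a `utah <subcommand>` entry point could look like 
with the standard library's argparse follows. The subcommand names 
(`run`, `provision`) and their arguments are assumptions I made up for 
the example, not a proposal for the final interface.

```python
import argparse


def build_parser():
    """Build a single `utah` parser that dispatches on a subcommand."""
    parser = argparse.ArgumentParser(prog='utah')
    subparsers = parser.add_subparsers(dest='subcommand')

    # Hypothetical subcommand: run a set of tests described by a file.
    run_parser = subparsers.add_parser('run', help='run tests')
    run_parser.add_argument('runlist', help='file listing the tests to run')

    # Hypothetical subcommand: provision a machine without running tests.
    provision_parser = subparsers.add_parser('provision',
                                             help='provision a machine')
    provision_parser.add_argument('--image', help='image to install')

    return parser


# Parse an example command line instead of sys.argv to keep this testable.
args = build_parser().parse_args(['run', 'smoke.run'])
```

The advantage of this layout is that `utah --help` lists every action in 
one place, and each subcommand gets its own help text.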

- Configuration
This is somewhat related to the command line interface, but also to the 
different ways of configuring UTAH's behavior that are available. Right 
now we have command line parameters, a default configuration file, a 
configuration file set in an environment variable and a directory with 
configuration files set in another environment variable. The problem 
with this is that not everything fits together nicely. One example is 
what we need to do to set up different log files in Jenkins jobs.

Aside from this, I think some options aren't well documented and/or are 
used in multiple contexts (for example, timeout values), which leads to 
more confusion.
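
One way to make this easier to reason about would be to treat every 
source as a layer and to document a single precedence order. The sketch 
below assumes an order (default file, then the file from the environment 
variable, then the directory, then the command line) that I picked only 
for illustration; the option names and values are made up as well.

```python
def merge_settings(*layers):
    """Merge dictionaries in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical contents of each configuration source.
default_file = {'timeout': 60, 'logfile': 'utah.log'}
env_file = {'logfile': 'jenkins-job.log'}   # file named by an env variable
env_directory = {}                          # directory named by another one
command_line = {'timeout': 120}             # parameters given by the user

# Assumed precedence, lowest to highest; this order is not UTAH's actual one.
settings = merge_settings(default_file, env_file, env_directory, command_line)
```

With a single documented order like this, the per-job log file case 
reduces to setting one option in whichever layer the job controls.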

- Packaging
The contents of the UTAH packages are confusing and not very easy to map 
to the branch layout. One example of this is that both 
`utah/exceptions.py` and `utah/client/exceptions.py` are in the 
`utah-client` package.

This isn't a big problem per se, but it is a symptom that we're not very 
close to the ideal of low coupling and high cohesion and that, 
consequently, for every new feature that is added, the developer has to 
deal with a complexity that grows exponentially.

In my opinion, having everything in the same branch has helped us work 
faster, but now that the project is getting big, we should think about 
having different branches (and projects) or, at least, one branch in 
which modules are not so interrelated.

- Other low level issues
There are some other pending issues, such as the refactoring of the 
Machine classes, improvements to the inventories implementation to deal 
easily with all kinds of systems that can be provisioned, etc.

Please let me know your thoughts on this.

Best regards,
     Javier