Difference between revisions of "Programming Guidelines"
(CxEOpqIcTI) |
m (→Error-handling: Actual comment) |
||
(12 intermediate revisions by 9 users not shown) | |||
Line 1: | Line 1: | ||
− | This | + | Initial starting point: [http://www.livejournal.com/doc/server/ljp.index.html LJ Programming Guidelines]. |
+ | |||
+ | {{Note|text=There is a summary of these guidelines at [[Programming Guideline Checklist]].}} | ||
+ | |||
+ | Now, let's talk a bit about how Dreamwidth code should be written. Note that this is the stylistic guideline to follow to make sure the code looks uniform. If you submit a patch that doesn't follow these guidelines, then it will likely get declined with a note to review this documentation. Realistically, we want everybody to be able to dive into the code and understand what is going on without having to try to interpret a dozen different coding styles. | ||
+ | |||
+ | Note that these are not hard and fast rules. A lot of the time you will be fixing up code somewhere that is broken, and you will want to follow the style of the code you are working in. If you're editing really old pages like talkread.bml then you should attempt to code legibly, but follow the style in use on the page. (Mixing styles within a file can be problematic.) | ||
+ | |||
+ | '''These are guidelines, and not to be afraid of!''' Please don't feel intimidated by the style of the code. People will help you out if you have questions or aren't sure how to write something. | ||
+ | |||
+ | == Whitespace == | ||
+ | |||
+ | === Spaces for indentation === | ||
+ | |||
+ | Yes, we use spaces: four per level of indentation. But use your judgment when lining things up. | ||
+ | |||
+ | === Spaces around parenthesis and operators === | ||
+ | |||
+ | Perl has a well-deserved reputation for being line noise. Let's try to avoid that by putting spaces around parenthesis and operators: | ||
+ | |||
+ | <source lang="perl"># This is right out: | ||
+ | if ((($a||$b)&&(($c>1)||$d&5))+1)>3) { ... } | ||
+ | |||
+ | # This is much better: | ||
+ | if ( ( ( $a || $b ) && ( ( $c > 1 ) || ( $d & 5 ) ) + 1 ) > 3 ) { ... }</source> | ||
+ | |||
+ | Well. It's better, but this line is terrible anyway. If you end up with a line like this, I'd suggest rewriting it entirely. Don't be too afraid of going onto multiple lines if it makes the operation more readable. | ||
+ | |||
+ | The "spaces around parenthesis" rule also applies to method calls. Here are some examples of when this comes into play: | ||
+ | |||
+ | <source lang="perl">call_method( $arg, $arg2 ); | ||
+ | |||
+ | do_something( 45 ) if $org || $neg;</source> | ||
+ | |||
+ | However, do NOT use spaces around operators used to index into arrays or hashes. It gets confusing. | ||
+ | |||
+ | <source lang="perl"># Wrong: | ||
+ | if ( $array[ 13 ] > $hash->{ $index } ) { ... } | ||
+ | |||
+ | # Right: | ||
+ | if ( $array[13] > $hash->{$index} ) { ... }</source> | ||
+ | |||
+ | === No tabs or trailing whitespace === | ||
+ | |||
+ | No tabs, ever. No trailing whitespace. | ||
+ | |||
+ | These things happen. The people reviewing code will try to catch them before commit. But don't commit tabs or trailing whitespace. Convert all tabs to spaces and prune trailing whitespace. | ||
+ | |||
+ | Tabs are not consistent between all editors and operating systems. | ||
+ | |||
+ | === Linebreak readably === | ||
+ | |||
+ | Long lines are difficult to read. Lines 72 or fewer characters long are ideal; 80-character lines are acceptable; lines over 120 characters long should probably be rewritten. | ||
+ | |||
+ | == Don't use extraneous parentheses == | ||
+ | |||
+ | Don't use extraneous parentheses. This happens mostly on method calls that don't need it and postfix conditionals. | ||
+ | |||
+ | <source lang="perl"># These are wrong: | ||
+ | my $r = DW::Request->get(); | ||
+ | do_something() if ( $r->method() eq 'POST' ); | ||
+ | |||
+ | # Fixed: | ||
+ | my $r = DW::Request->get; | ||
+ | do_something() if $r->method eq 'POST';</source> | ||
+ | |||
+ | Note that you need parentheses after bareword function calls such as the do_something() above. Otherwise Perl has no idea it's a method and not a constant or some other value. But when you call it on a reference (using the -> operator) then it's assumed to be a method. | ||
+ | |||
+ | == Logic == | ||
+ | |||
+ | === Postfix conditionals === | ||
+ | |||
+ | Use postfix conditionals. That means: | ||
+ | |||
+ | <source lang="perl"># This is incorrect: | ||
+ | if ( $something ) { | ||
+ | do_something(); | ||
+ | } | ||
+ | |||
+ | # Try this instead: | ||
+ | do_something() if $something; | ||
+ | |||
+ | # Or even: | ||
+ | do_something() | ||
+ | if $something;</source> | ||
+ | |||
+ | It is your discretion which of the forms you use. Pick whatever seems appropriate to the code. (I tend to prefer the latter personally unless the line is really short.) | ||
+ | |||
+ | '''EXCEPTION:''' Do not combine postfix conditionals with "my" statements, as subsequent code can cause the variable to be accidentally assigned in global scope. Perl's strict and warnings tests do not detect this. | ||
+ | |||
+ | <source lang="perl"># use at your own risk | ||
+ | my $foo = 'bar' if $u->is_special; | ||
+ | |||
+ | # preferred alternative | ||
+ | my $foo; | ||
+ | $foo = 'bar' if $u->is_special;</source> | ||
+ | |||
+ | === Use human logic === | ||
+ | |||
+ | This is a big one that will really help people reading the code. The human brain is good at some things, bad at others. | ||
+ | |||
+ | <source lang="perl"># if something is true is easy to understand: | ||
+ | if ( $whatever ) { ... } | ||
+ | |||
+ | # unless something is true is also easy to understand: | ||
+ | unless ( $whatever ) { ... }</source> | ||
+ | |||
+ | However, some things are NOT easy to understand, and should be rewritten. | ||
+ | |||
+ | <source lang="perl"># don't use this unless you REALLY need to | ||
+ | if ( ! $something ) { ... } | ||
+ | |||
+ | # use this instead: | ||
+ | unless ( $something ) { ... }</source> | ||
+ | |||
+ | In some cases, the variable in question lends itself to the negation syntax in the first example. But that's a rare case. Just think about it for a second and go with what flows. | ||
+ | |||
+ | <source lang="perl"># never use this | ||
+ | unless ( ! $foobar ) { ... } | ||
+ | |||
+ | # or this | ||
+ | unless ( $foobar ) { ... } else { ... }</source> | ||
+ | |||
+ | There is no situation in which "unless not $something" makes sense. The brain just doesn't parse it. It's a double negative, you might as well just use "if $something" and call it good! | ||
+ | |||
+ | == Data structures == | ||
+ | |||
+ | === Proper use of hashes === | ||
+ | |||
+ | Proper use of hashes is important. Do not quote literals unless needed. That may not make sense, so here are some examples: | ||
+ | |||
+ | <source lang="perl"># This is bad | ||
+ | $HASH{'DATA'} = 5; | ||
+ | $hashref->{'Something'} = $foo; | ||
+ | |||
+ | # try these instead | ||
+ | $HASH{DATA} = 5; | ||
+ | $hashref->{Something} = $foo;</source> | ||
+ | |||
+ | The only time that you need quotes is if the value you are quoting has non-word characters in it. I.e., dashes, quotes, etc. | ||
+ | |||
+ | ===Array length=== | ||
+ | |||
+ | The correct way to measure the length of an array is to use the "scalar" built-in function. The special variable $#arrayname is sometimes used for this purpose, but it returns the index of the last element of the array; thus the length of the array is always $#arrayname + 1. Since this inevitably leads to mistakes, it's best to avoid using that syntax entirely. | ||
+ | |||
+ | <source lang="perl">my @array = qw( apples oranges bananas ); | ||
+ | print scalar @array; # returns 3, easy to understand | ||
+ | print $#array + 1; # returns 3, hard to understand</source> | ||
+ | |||
+ | === Local variable assignment === | ||
+ | |||
+ | It has been observed that code performance suffers if multiple calls to the "shift" operator are executed. Best practice is to use list assignment instead. | ||
+ | |||
+ | <source lang="perl"># Multiple shifts: bad | ||
+ | my $var_1 = shift; | ||
+ | my $var_2 = shift; | ||
+ | my $var_3 = shift; | ||
+ | |||
+ | # List assignment: good | ||
+ | my ( $var_1, $var_2, $var_3 ) = @_;</source> | ||
+ | |||
+ | For very short functions (5 lines or less) it is reasonable to refer to the arguments array directly (using $_[0] et al) and avoid local variables entirely. | ||
+ | |||
+ | See also: [[Optimizing code]]. | ||
+ | |||
+ | == Error-handling == | ||
+ | |||
+ | There are a number of different methods of dealing with errors in the codebase. Broadly, these fall into the categories of "things that should return a polite error and carry on" and "things that can give an error message that won't make sense to most users". | ||
+ | For the latter, there are still a few different methods about, and if making minor changes to existing code you should probably respect the method that is used in that code. But for new code, use <tt>die</tt> or <tt>Carp::croak</tt>. See [https://dw-dev-training.dreamwidth.org/53222.html?thread=360934#cmt360934 this comment] from Mark for a full explanation. | ||
+ | |||
+ | == LJ Specific Conventions == | ||
+ | |||
+ | === Use LJ::is_enabled() === | ||
+ | |||
+ | Use <tt>LJ::is_enabled()</tt> instead of <tt>$LJ::DISABLED</tt>. Then you don't have to use negative logic and can pass parameters to it. <tt>LJ::is_enabled</tt> wraps <tt>$LJ::DISABLED</tt> in <tt>conf_test()</tt>. | ||
+ | |||
+ | === Preferred variable usage of $LJ::HOME === | ||
+ | |||
+ | <tt>$LJ::HOME/cgi-bin</tt> is preferred over <tt>$ENV{LJHOME}/cgi-bin</tt> -- use <tt>$LJ::HOME</tt> instead, and if you see the old form, feel free to change it. | ||
+ | |||
+ | However, some files, notably those in the bin/ directory, need to use the old form because the code file that defines <tt>$LJ::HOME</tt> isn't loaded. So, if you see something like this: | ||
+ | |||
+ | <tt>use lib "$ENV{LJHOME}/cgi-bin";</tt><br /> | ||
+ | <tt>require 'ljlib.pl';</tt> | ||
+ | |||
+ | That's trying to bootstrap the system. Until after ljlib.pl is fully loaded, <tt>$LJ::HOME</tt> is empty/undef, so we are allowed to depend on the existence of the environment variable. | ||
+ | |||
+ | === Use $u->get_cap === | ||
+ | |||
+ | <tt>$u->get_cap</tt> is preferred over <tt>LJ::get_cap($u, ...)</tt>. | ||
+ | |||
+ | But even better is adding the method as an accessor in LJ::User. Then you won't have to actually check caps every time. | ||
+ | |||
+ | == Commenting == | ||
+ | |||
+ | Please comment your new code as extensively as possible so that someone coming along behind you can follow what you're doing. If you're working with old code, and you find yourself having to puzzle through what various functions are doing, comment it up to indicate what you figure out. We want everything to be as understandable as possible for anyone who comes along after you. You don't need to comment every line, but anything of moderate complexity should have a line or two indicating what it does and why. | ||
+ | |||
+ | Code comments should be kept professional -- no profanity, no offensive or exclusionary language (in particular, the LJ code has a tendency to use the words "ghetto" or "lame" to indicate that something isn't as good as it should be, and we should absolutely avoid that and change it whenever we encounter it), no self-deprecating humor ("this is totally stupid of me but ..."), etc. This does not mean comments are required to be boring - but they should be informative, precise, and not offensive. | ||
+ | |||
+ | If you are refactoring code that contains language that does not meet with the Dreamwidth standards, please go ahead and change it as you refactor the code. We want people to be able to see our codebase and come away feeling that it's a clean and approachable system. | ||
+ | |||
+ | If you need to indicate that something will need attention again later, we've standardized on "FIXME" as our "fix me or add to me later" indicator. Always indicate why you're flagging something FIXME: | ||
+ | |||
+ | <source lang="perl"># FIXME: Works as-is, but needs foo, bar, and baz later.</source> | ||
+ | |||
+ | <source lang="perl"># FIXME: Totally broken. Needs foo, bar, and baz to work again.</source> | ||
+ | |||
+ | == English stripping == | ||
+ | |||
+ | Effective [[English-stripping|English stripping]] is a requirement! While we might not translate DW.org into other languages, other people who are using the DW code should be given the functionality to do so on their own sites. This means that you need to be conscious of how to effectively strip a page or file of the text in it. | ||
+ | |||
+ | == Source file headers and credit == | ||
+ | |||
+ | All new files you create should include the appropriate header in the file. You can see the list of sample headers for new dw-free files, new dw-nonfree files, and files that used to be 'livejournal' code but have been completely rewritten at [[File headers]]. | ||
+ | |||
+ | [[Category: Development]] |
Latest revision as of 17:33, 10 May 2017
Initial starting point: LJ Programming Guidelines.
Now, let's talk a bit about how Dreamwidth code should be written. Note that this is the stylistic guideline to follow to make sure the code looks uniform. If you submit a patch that doesn't follow these guidelines, then it will likely get declined with a note to review this documentation. Realistically, we want everybody to be able to dive into the code and understand what is going on without having to try to interpret a dozen different coding styles.
Note that these are not hard and fast rules. A lot of the time you will be fixing up code somewhere that is broken, and you will want to follow the style of the code you are working in. If you're editing really old pages like talkread.bml then you should attempt to code legibly, but follow the style in use on the page. (Mixing styles within a file can be problematic.)
These are guidelines, and not to be afraid of! Please don't feel intimidated by the style of the code. People will help you out if you have questions or aren't sure how to write something.
Contents
Whitespace
Spaces for indentation
Yes, we use spaces: four per level of indentation. But use your judgment when lining things up.
Spaces around parenthesis and operators
Perl has a well-deserved reputation for being line noise. Let's try to avoid that by putting spaces around parenthesis and operators:
# This is right out: if ((($a||$b)&&(($c>1)||$d&5))+1)>3) { ... } # This is much better: if ( ( ( $a || $b ) && ( ( $c > 1 ) || ( $d & 5 ) ) + 1 ) > 3 ) { ... }
Well. It's better, but this line is terrible anyway. If you end up with a line like this, I'd suggest rewriting it entirely. Don't be too afraid of going onto multiple lines if it makes the operation more readable.
The "spaces around parenthesis" rule also applies to method calls. Here are some examples of when this comes into play:
call_method( $arg, $arg2 ); do_something( 45 ) if $org || $neg;
However, do NOT use spaces around operators used to index into arrays or hashes. It gets confusing.
# Wrong: if ( $array[ 13 ] > $hash->{ $index } ) { ... } # Right: if ( $array[13] > $hash->{$index} ) { ... }
No tabs or trailing whitespace
No tabs, ever. No trailing whitespace.
These things happen. The people reviewing code will try to catch them before commit. But don't commit tabs or trailing whitespace. Convert all tabs to spaces and prune trailing whitespace.
Tabs are not consistent between all editors and operating systems.
Linebreak readably
Long lines are difficult to read. Lines 72 or fewer characters long are ideal; 80-character lines are acceptable; lines over 120 characters long should probably be rewritten.
Don't use extraneous parentheses
Don't use extraneous parentheses. This happens mostly on method calls that don't need it and postfix conditionals.
# These are wrong: my $r = DW::Request->get(); do_something() if ( $r->method() eq 'POST' ); # Fixed: my $r = DW::Request->get; do_something() if $r->method eq 'POST';
Note that you need parentheses after bareword function calls such as the do_something() above. Otherwise Perl has no idea it's a method and not a constant or some other value. But when you call it on a reference (using the -> operator) then it's assumed to be a method.
Logic
Postfix conditionals
Use postfix conditionals. That means:
# This is incorrect: if ( $something ) { do_something(); } # Try this instead: do_something() if $something; # Or even: do_something() if $something;
It is your discretion which of the forms you use. Pick whatever seems appropriate to the code. (I tend to prefer the latter personally unless the line is really short.)
EXCEPTION: Do not combine postfix conditionals with "my" statements, as subsequent code can cause the variable to be accidentally assigned in global scope. Perl's strict and warnings tests do not detect this.
# use at your own risk my $foo = 'bar' if $u->is_special; # preferred alternative my $foo; $foo = 'bar' if $u->is_special;
Use human logic
This is a big one that will really help people reading the code. The human brain is good at some things, bad at others.
# if something is true is easy to understand: if ( $whatever ) { ... } # unless something is true is also easy to understand: unless ( $whatever ) { ... }
However, some things are NOT easy to understand, and should be rewritten.
# don't use this unless you REALLY need to if ( ! $something ) { ... } # use this instead: unless ( $something ) { ... }
In some cases, the variable in question lends itself to the negation syntax in the first example. But that's a rare case. Just think about it for a second and go with what flows.
# never use this unless ( ! $foobar ) { ... } # or this unless ( $foobar ) { ... } else { ... }
There is no situation in which "unless not $something" makes sense. The brain just doesn't parse it. It's a double negative, you might as well just use "if $something" and call it good!
Data structures
Proper use of hashes
Proper use of hashes is important. Do not quote literals unless needed. That may not make sense, so here are some examples:
# This is bad $HASH{'DATA'} = 5; $hashref->{'Something'} = $foo; # try these instead $HASH{DATA} = 5; $hashref->{Something} = $foo;
The only time that you need quotes is if the value you are quoting has non-word characters in it. I.e., dashes, quotes, etc.
Array length
The correct way to measure the length of an array is to use the "scalar" built-in function. The special variable $#arrayname is sometimes used for this purpose, but it returns the index of the last element of the array; thus the length of the array is always $#arrayname + 1. Since this inevitably leads to mistakes, it's best to avoid using that syntax entirely.
my @array = qw( apples oranges bananas ); print scalar @array; # returns 3, easy to understand print $#array + 1; # returns 3, hard to understand
Local variable assignment
It has been observed that code performance suffers if multiple calls to the "shift" operator are executed. Best practice is to use list assignment instead.
# Multiple shifts: bad my $var_1 = shift; my $var_2 = shift; my $var_3 = shift; # List assignment: good my ( $var_1, $var_2, $var_3 ) = @_;
For very short functions (5 lines or less) it is reasonable to refer to the arguments array directly (using $_[0] et al) and avoid local variables entirely.
See also: Optimizing code.
Error-handling
There are a number of different methods of dealing with errors in the codebase. Broadly, these fall into the categories of "things that should return a polite error and carry on" and "things that can give an error message that won't make sense to most users". For the latter, there are still a few different methods about, and if making minor changes to existing code you should probably respect the method that is used in that code. But for new code, use die or Carp::croak. See this comment from Mark for a full explanation.
LJ Specific Conventions
Use LJ::is_enabled()
Use LJ::is_enabled() instead of $LJ::DISABLED. Then you don't have to use negative logic and can pass parameters to it. LJ::is_enabled wraps $LJ::DISABLED in conf_test().
Preferred variable usage of $LJ::HOME
$LJ::HOME/cgi-bin is preferred over $ENV{LJHOME}/cgi-bin -- use $LJ::HOME instead, and if you see the old form, feel free to change it.
However, some files, notably those in the bin/ directory, need to use the old form because the code file that defines $LJ::HOME isn't loaded. So, if you see something like this:
use lib "$ENV{LJHOME}/cgi-bin";
require 'ljlib.pl';
That's trying to bootstrap the system. Until after ljlib.pl is fully loaded, $LJ::HOME is empty/undef, so we are allowed to depend on the existence of the environment variable.
Use $u->get_cap
$u->get_cap is preferred over LJ::get_cap($u, ...).
But even better is adding the method as an accessor in LJ::User. Then you won't have to actually check caps every time.
Commenting
Please comment your new code as extensively as possible so that someone coming along behind you can follow what you're doing. If you're working with old code, and you find yourself having to puzzle through what various functions are doing, comment it up to indicate what you figure out. We want everything to be as understandable as possible for anyone who comes along after you. You don't need to comment every line, but anything of moderate complexity should have a line or two indicating what it does and why.
Code comments should be kept professional -- no profanity, no offensive or exclusionary language (in particular, the LJ code has a tendency to use the words "ghetto" or "lame" to indicate that something isn't as good as it should be, and we should absolutely avoid that and change it whenever we encounter it), no self-deprecating humor ("this is totally stupid of me but ..."), etc. This does not mean comments are required to be boring - but they should be informative, precise, and not offensive.
If you are refactoring code that contains language that does not meet with the Dreamwidth standards, please go ahead and change it as you refactor the code. We want people to be able to see our codebase and come away feeling that it's a clean and approachable system.
If you need to indicate that something will need attention again later, we've standardized on "FIXME" as our "fix me or add to me later" indicator. Always indicate why you're flagging something FIXME:
# FIXME: Works as-is, but needs foo, bar, and baz later.
# FIXME: Totally broken. Needs foo, bar, and baz to work again.
English stripping
Effective English stripping is a requirement! While we might not translate DW.org into other languages, other people who are using the DW code should be given the functionality to do so on their own sites. This means that you need to be conscious of how to effectively strip a page or file of the text in it.
Source file headers and credit
All new files you create should include the appropriate header in the file. You can see the list of sample headers for new dw-free files, new dw-nonfree files, and files that used to be 'livejournal' code but have been completely rewritten at File headers.