Difference between revisions of "Programming Guidelines"

From Dreamwidth Notes
Jump to: navigation, search
m (Error-handling: Actual comment)
 
(42 intermediate revisions by 18 users not shown)
Line 1: Line 1:
 
Initial starting point: [http://www.livejournal.com/doc/server/ljp.index.html LJ Programming Guidelines].
 
Initial starting point: [http://www.livejournal.com/doc/server/ljp.index.html LJ Programming Guidelines].
 +
 +
{{Note|text=There is a summary of these guidelines at [[Programming Guideline Checklist]].}}
  
 
Now, let's talk a bit about how Dreamwidth code should be written.  Note that this is the stylistic guideline to follow to make sure the code looks uniform.  If you submit a patch that doesn't follow these guidelines, then it will likely get declined with a note to review this documentation.  Realistically, we want everybody to be able to dive into the code and understand what is going on without having to try to interpret a dozen different coding styles.
 
Now, let's talk a bit about how Dreamwidth code should be written.  Note that this is the stylistic guideline to follow to make sure the code looks uniform.  If you submit a patch that doesn't follow these guidelines, then it will likely get declined with a note to review this documentation.  Realistically, we want everybody to be able to dive into the code and understand what is going on without having to try to interpret a dozen different coding styles.
Line 7: Line 9:
 
'''These are guidelines, and not to be afraid of!'''  Please don't feel intimidated by the style of the code.  People will help you out if you have questions or aren't sure how to write something.
 
'''These are guidelines, and not to be afraid of!'''  Please don't feel intimidated by the style of the code.  People will help you out if you have questions or aren't sure how to write something.
  
* Spaces.  Yes, we use spaces.  There are four per level of indentation.  But use your judgement when you're lining things up.
+
== Whitespace ==
  
* Use postfix conditionals.  That means:
+
=== Spaces for indentation ===
  
    # This is incorrect:
+
Yes, we use spaces: four per level of indentation.  But use your judgment when lining things up.
    if ( $something ) {
+
        do_something();
+
    }
+
   
+
    # Try this instead:
+
    do_something() if $something;
+
   
+
    # Or even:
+
    do_something()
+
        if $something;
+
  
It is your discretion which of the forms you use.  Pick whatever seems appropriate to the code.  (I tend to prefer the latter personally unless the line is really short.)
+
=== Spaces around parenthesis and operators ===
  
* Spaces around parenthesis and operators.  Perl has a well-deserved reputation for being line noise.  Let's try to avoid that.
+
Perl has a well-deserved reputation for being line noise.  Let's try to avoid that by putting spaces around parenthesis and operators:
  
    # This is right out:
+
<source lang="perl"># This is right out:
    if ((($a||$b)&&(($c>1)||$d&5))+1)>3) { ... }
+
if ((($a||$b)&&(($c>1)||$d&5))+1)>3) { ... }
   
+
 
    # This is much better:
+
# This is much better:
    if ( ( ( $a || $b ) && ( ( $c > 1 ) || ( $d & 5 ) ) + 1 ) > 3 ) { ... }
+
if ( ( ( $a || $b ) && ( ( $c > 1 ) || ( $d & 5 ) ) + 1 ) > 3 ) { ... }</source>
  
 
Well.  It's better, but this line is terrible anyway.  If you end up with a line like this, I'd suggest rewriting it entirely.  Don't be too afraid of going onto multiple lines if it makes the operation more readable.
 
Well.  It's better, but this line is terrible anyway.  If you end up with a line like this, I'd suggest rewriting it entirely.  Don't be too afraid of going onto multiple lines if it makes the operation more readable.
  
The "space around parenthesis" rule also applies to method calls.  Here are some examples of when this comes into play:
+
The "spaces around parenthesis" rule also applies to method calls.  Here are some examples of when this comes into play:
  
    call_method( $arg, $arg2 );
+
<source lang="perl">call_method( $arg, $arg2 );
   
+
    do_something( 45 ) if $org || $neg;
+
  
However, do NOT use parenthesis around operators used to index into arrays or hashes.  It gets confusing.
+
do_something( 45 ) if $org || $neg;</source>
  
    # Wrong:
+
However, do NOT use spaces around operators used to index into arrays or hashes. It gets confusing.
    if ( $array[ 13 ] > $hash->{ $index } ) { ... }
+
   
+
    # Right:
+
    if ( $array[13] > $hash->{$index} ) { ... }
+
  
* Don't use extraneous parenthesis. This happens mostly on method calls that don't need it and postfix conditionals.
+
<source lang="perl"># Wrong:
 +
if ( $array[ 13 ] > $hash->{ $index } ) { ... }
  
    # These are wrong:
+
# Right:
    my $r = DW::Request->get();
+
if ( $array[13] > $hash->{$index} ) { ... }</source>
    do_something() if ( $r->method() eq 'POST' );
+
   
+
    # Fixed:
+
    my $r = DW::Request->get;
+
    do_something() if $r->method eq 'POST';
+
  
Note that you need parenthesis after bareword function calls such as the do_something() above.  Otherwise Perl has no idea it's a method and not a constant or some other value.  But when you call it on a reference (using the -> operator) then it's assumed to be a method.
+
=== No tabs or trailing whitespace ===
  
* No tabs, ever.  No trailing whitespace.
+
No tabs, ever.  No trailing whitespace.
  
These things happen.  I'll try to catch them before commit.  But don't commit tabs or trailing whitespace.  Convert all tabs to spaces and prune trailing whitespace.
+
These things happen.  The people reviewing code will try to catch them before commit.  But don't commit tabs or trailing whitespace.  Convert all tabs to spaces and prune trailing whitespace.
  
* Use human logic.  This is a big one that will really help people reading the code.  The human brain is good at some things, bad at others.
+
Tabs are not consistent between all editors and operating systems.
  
     # if something is true is easy to understand:
+
=== Linebreak readably ===
    if ( $whatever ) { ... }
+
 
 +
Long lines are difficult to read. Lines 72 or fewer characters long are ideal; 80-character lines are acceptable; lines over 120 characters long should probably be rewritten.
 +
 
 +
== Don't use extraneous parentheses ==
 +
 
 +
Don't use extraneous parentheses.  This happens mostly on method calls that don't need it and postfix conditionals.
 +
 
 +
<source lang="perl"># These are wrong:
 +
my $r = DW::Request->get();
 +
do_something() if ( $r->method() eq 'POST' );
 +
 
 +
# Fixed:
 +
my $r = DW::Request->get;
 +
do_something() if $r->method eq 'POST';</source>
 +
 
 +
Note that you need parentheses after bareword function calls such as the do_something() above.  Otherwise Perl has no idea it's a method and not a constant or some other value.  But when you call it on a reference (using the -> operator) then it's assumed to be a method.
 +
 
 +
== Logic ==
 +
 
 +
=== Postfix conditionals ===
 +
 
 +
Use postfix conditionals.  That means:
 +
 
 +
<source lang="perl"># This is incorrect:
 +
if ( $something ) {
 +
     do_something();
 +
}
 +
 
 +
# Try this instead:
 +
do_something() if $something;
 +
 
 +
# Or even:
 +
do_something()
 +
    if $something;</source>
 +
 
 +
It is your discretion which of the forms you use.  Pick whatever seems appropriate to the code.  (I tend to prefer the latter personally unless the line is really short.)
 +
 
 +
'''EXCEPTION:''' Do not combine postfix conditionals with "my" statements, as subsequent code can cause the variable to be accidentally assigned in global scope.  Perl's strict and warnings tests do not detect this.
 +
 
 +
<source lang="perl"># use at your own risk
 +
my $foo = 'bar' if $u->is_special;
 +
 
 +
# preferred alternative
 +
my $foo;
 +
$foo = 'bar' if $u->is_special;</source>
 +
 
 +
=== Use human logic ===
 +
 
 +
This is a big one that will really help people reading the code.  The human brain is good at some things, bad at others.
 +
 
 +
<source lang="perl"># if something is true is easy to understand:
 +
if ( $whatever ) { ... }
 
      
 
      
    # unless something is true is also easy to understand:
+
# unless something is true is also easy to understand:
    unless ( $whatever ) { ... }
+
unless ( $whatever ) { ... }</source>
  
 
However, some things are NOT easy to understand, and should be rewritten.
 
However, some things are NOT easy to understand, and should be rewritten.
  
    # don't use this unless you REALLY need to
+
<source lang="perl"># don't use this unless you REALLY need to
    if ( ! $something ) { ... }
+
if ( ! $something ) { ... }
   
+
 
    # use this instead:
+
# use this instead:
    unless ( $something ) { ... }
+
unless ( $something ) { ... }</source>
  
 
In some cases, the variable in question lends itself to the negation syntax in the first example.  But that's a rare case.  Just think about it for a second and go with what flows.
 
In some cases, the variable in question lends itself to the negation syntax in the first example.  But that's a rare case.  Just think about it for a second and go with what flows.
  
    # never use this
+
<source lang="perl"># never use this
    unless ( ! $foobar ) { ... }
+
unless ( ! $foobar ) { ... }
 +
 
 +
# or this
 +
unless ( $foobar ) { ... } else { ... }</source>
  
 
There is no situation in which "unless not $something" makes sense.  The brain just doesn't parse it.  It's a double negative, you might as well just use "if $something" and call it good!
 
There is no situation in which "unless not $something" makes sense.  The brain just doesn't parse it.  It's a double negative, you might as well just use "if $something" and call it good!
 +
 +
== Data structures ==
 +
 +
=== Proper use of hashes ===
 +
 +
Proper use of hashes is important.  Do not quote literals unless needed.  That may not make sense, so here are some examples:
 +
 +
<source lang="perl"># This is bad
 +
$HASH{'DATA'} = 5;
 +
$hashref->{'Something'} = $foo;
 +
 +
# try these instead
 +
$HASH{DATA} = 5;
 +
$hashref->{Something} = $foo;</source>
 +
 +
The only time that you need quotes is if the value you are quoting has non-word characters in it.  I.e., dashes, quotes, etc.
 +
 +
===Array length===
 +
 +
The correct way to measure the length of an array is to use the "scalar" built-in function.  The special variable $#arrayname is sometimes used for this purpose, but it returns the index of the last element of the array; thus the length of the array is always $#arrayname + 1.  Since this inevitably leads to mistakes, it's best to avoid using that syntax entirely.
 +
 +
<source lang="perl">my @array = qw( apples oranges bananas );
 +
print scalar @array;  # returns 3, easy to understand
 +
print $#array + 1; # returns 3, hard to understand</source>
 +
 +
=== Local variable assignment ===
 +
 +
It has been observed that code performance suffers if multiple calls to the "shift" operator are executed.  Best practice is to use list assignment instead.
 +
 +
<source lang="perl"># Multiple shifts: bad
 +
my $var_1 = shift;
 +
my $var_2 = shift;
 +
my $var_3 = shift;
 +
 +
# List assignment: good
 +
my ( $var_1, $var_2, $var_3 ) = @_;</source>
 +
 +
For very short functions (5 lines or less) it is reasonable to refer to the arguments array directly (using $_[0] et al) and avoid local variables entirely.
 +
 +
See also: [[Optimizing code]].
 +
 +
== Error-handling ==
 +
 +
There are a number of different methods of dealing with errors in the codebase. Broadly, these fall into the categories of "things that should return a polite error and carry on" and "things that can give an error message that won't make sense to most users".
 +
For the latter, there are still a few different methods about, and if making minor changes to existing code you should probably respect the method that is used in that code. But for new code, use <tt>die</tt> or <tt>Carp::croak</tt>. See [https://dw-dev-training.dreamwidth.org/53222.html?thread=360934#cmt360934 this comment] from Mark for a full explanation.
 +
 +
== LJ Specific Conventions ==
 +
 +
=== Use LJ::is_enabled() ===
 +
 +
Use <tt>LJ::is_enabled()</tt> instead of <tt>$LJ::DISABLED</tt>.  Then you don't have to use negative logic and can pass parameters to it. <tt>LJ::is_enabled</tt> wraps <tt>$LJ::DISABLED</tt> in <tt>conf_test()</tt>.
 +
 +
=== Preferred variable usage of $LJ::HOME ===
 +
 +
<tt>$LJ::HOME/cgi-bin</tt> is preferred over <tt>$ENV{LJHOME}/cgi-bin</tt> -- use <tt>$LJ::HOME</tt> instead, and if you see the old form, feel free to change it.
 +
 +
However, some files, notably those in the bin/ directory, need to use the old form because the code file that defines <tt>$LJ::HOME</tt> isn't loaded. So, if you see something like this:
 +
 +
<tt>use lib "$ENV{LJHOME}/cgi-bin";</tt><br />
 +
<tt>require 'ljlib.pl';</tt>
 +
 +
That's trying to bootstrap the system. Until after ljlib.pl is fully loaded, <tt>$LJ::HOME</tt> is empty/undef, so we are allowed to depend on the existence of the environment variable.
 +
 +
=== Use $u->get_cap ===
 +
 +
<tt>$u->get_cap</tt> is preferred over <tt>LJ::get_cap($u, ...)</tt>.
 +
 +
But even better is adding the method as an accessor in LJ::User.  Then you won't have to actually check caps every time.
 +
 +
== Commenting ==
 +
 +
Please comment your new code as extensively as possible so that someone coming along behind you can follow what you're doing. If you're working with old code, and you find yourself having to puzzle through what various functions are doing, comment it up to indicate what you figure out. We want everything to be as understandable as possible for anyone who comes along after you. You don't need to comment every line, but anything of moderate complexity should have a line or two indicating what it does and why.
 +
 +
Code comments should be kept professional -- no profanity, no offensive or exclusionary language (in particular, the LJ code has a tendency to use the words "ghetto" or "lame" to indicate that something isn't as good as it should be, and we should absolutely avoid that and change it whenever we encounter it), no self-deprecating humor ("this is totally stupid of me but ..."), etc. This does not mean comments are required to be boring - but they should be informative, precise, and not offensive.
 +
 +
If you are refactoring code that contains language that does not meet with the Dreamwidth standards, please go ahead and change it as you refactor the code.  We want people to be able to see our codebase and come away feeling that it's a clean and approachable system.
 +
 +
If you need to indicate that something will need attention again later, we've standardized on "FIXME" as our "fix me or add to me later" indicator. Always indicate why you're flagging something FIXME:
 +
 +
<source lang="perl"># FIXME: Works as-is, but needs foo, bar, and baz later.</source>
 +
 +
<source lang="perl"># FIXME: Totally broken. Needs foo, bar, and baz to work again.</source>
 +
 +
== English stripping ==
 +
 +
Effective [[English-stripping|English stripping]] is a requirement!  While we might not translate DW.org into other languages, other people who are using the DW code should be given the functionality to do so on their own sites.  This means that you need to be conscious of how to effectively strip a page or file of the text in it.
 +
 +
== Source file headers and credit ==
 +
 +
All new files you create should include the appropriate header in the file. You can see the list of sample headers for new dw-free files, new dw-nonfree files, and files that used to be 'livejournal' code but have been completely rewritten at [[File headers]].
  
 
[[Category: Development]]
 
[[Category: Development]]

Latest revision as of 17:33, 10 May 2017

Initial starting point: LJ Programming Guidelines.

Note: There is a summary of these guidelines at Programming Guideline Checklist.

Now, let's talk a bit about how Dreamwidth code should be written. Note that this is the stylistic guideline to follow to make sure the code looks uniform. If you submit a patch that doesn't follow these guidelines, then it will likely get declined with a note to review this documentation. Realistically, we want everybody to be able to dive into the code and understand what is going on without having to try to interpret a dozen different coding styles.

Note that these are not hard and fast rules. A lot of the time you will be fixing up code somewhere that is broken, and you will want to follow the style of the code you are working in. If you're editing really old pages like talkread.bml then you should attempt to code legibly, but follow the style in use on the page. (Mixing styles within a file can be problematic.)

These are guidelines, and not to be afraid of! Please don't feel intimidated by the style of the code. People will help you out if you have questions or aren't sure how to write something.

Whitespace

Spaces for indentation

Yes, we use spaces: four per level of indentation. But use your judgment when lining things up.

Spaces around parenthesis and operators

Perl has a well-deserved reputation for being line noise. Let's try to avoid that by putting spaces around parenthesis and operators:

# This is right out:
if ((($a||$b)&&(($c>1)||$d&5))+1)>3) { ... }
 
# This is much better:
if ( ( ( $a || $b ) && ( ( $c > 1 ) || ( $d & 5 ) ) + 1 ) > 3 ) { ... }

Well. It's better, but this line is terrible anyway. If you end up with a line like this, I'd suggest rewriting it entirely. Don't be too afraid of going onto multiple lines if it makes the operation more readable.

The "spaces around parenthesis" rule also applies to method calls. Here are some examples of when this comes into play:

call_method( $arg, $arg2 );
 
do_something( 45 ) if $org || $neg;

However, do NOT use spaces around operators used to index into arrays or hashes. It gets confusing.

# Wrong:
if ( $array[ 13 ] > $hash->{ $index } ) { ... }
 
# Right:
if ( $array[13] > $hash->{$index} ) { ... }

No tabs or trailing whitespace

No tabs, ever. No trailing whitespace.

These things happen. The people reviewing code will try to catch them before commit. But don't commit tabs or trailing whitespace. Convert all tabs to spaces and prune trailing whitespace.

Tabs are not consistent between all editors and operating systems.

Linebreak readably

Long lines are difficult to read. Lines 72 or fewer characters long are ideal; 80-character lines are acceptable; lines over 120 characters long should probably be rewritten.

Don't use extraneous parentheses

Don't use extraneous parentheses. This happens mostly on method calls that don't need it and postfix conditionals.

# These are wrong:
my $r = DW::Request->get();
do_something() if ( $r->method() eq 'POST' );
 
# Fixed:
my $r = DW::Request->get;
do_something() if $r->method eq 'POST';

Note that you need parentheses after bareword function calls such as the do_something() above. Otherwise Perl has no idea it's a method and not a constant or some other value. But when you call it on a reference (using the -> operator) then it's assumed to be a method.

Logic

Postfix conditionals

Use postfix conditionals. That means:

# This is incorrect:
if ( $something ) {
    do_something();
}
 
# Try this instead:
do_something() if $something;
 
# Or even:
do_something()
    if $something;

It is your discretion which of the forms you use. Pick whatever seems appropriate to the code. (I tend to prefer the latter personally unless the line is really short.)

EXCEPTION: Do not combine postfix conditionals with "my" statements, as subsequent code can cause the variable to be accidentally assigned in global scope. Perl's strict and warnings tests do not detect this.

# use at your own risk
my $foo = 'bar' if $u->is_special;
 
# preferred alternative
my $foo;
$foo = 'bar' if $u->is_special;

Use human logic

This is a big one that will really help people reading the code. The human brain is good at some things, bad at others.

# if something is true is easy to understand:
if ( $whatever ) { ... }
 
# unless something is true is also easy to understand:
unless ( $whatever ) { ... }

However, some things are NOT easy to understand, and should be rewritten.

# don't use this unless you REALLY need to
if ( ! $something ) { ... }
 
# use this instead:
unless ( $something ) { ... }

In some cases, the variable in question lends itself to the negation syntax in the first example. But that's a rare case. Just think about it for a second and go with what flows.

# never use this
unless ( ! $foobar ) { ... }
 
# or this
unless ( $foobar ) { ... } else { ... }

There is no situation in which "unless not $something" makes sense. The brain just doesn't parse it. It's a double negative, you might as well just use "if $something" and call it good!

Data structures

Proper use of hashes

Proper use of hashes is important. Do not quote literals unless needed. That may not make sense, so here are some examples:

# This is bad
$HASH{'DATA'} = 5;
$hashref->{'Something'} = $foo;
 
# try these instead
$HASH{DATA} = 5;
$hashref->{Something} = $foo;

The only time that you need quotes is if the value you are quoting has non-word characters in it. I.e., dashes, quotes, etc.

Array length

The correct way to measure the length of an array is to use the "scalar" built-in function. The special variable $#arrayname is sometimes used for this purpose, but it returns the index of the last element of the array; thus the length of the array is always $#arrayname + 1. Since this inevitably leads to mistakes, it's best to avoid using that syntax entirely.

my @array = qw( apples oranges bananas );
print scalar @array;  # returns 3, easy to understand
print $#array + 1; # returns 3, hard to understand

Local variable assignment

It has been observed that code performance suffers if multiple calls to the "shift" operator are executed. Best practice is to use list assignment instead.

# Multiple shifts: bad
my $var_1 = shift;
my $var_2 = shift;
my $var_3 = shift;
 
# List assignment: good
my ( $var_1, $var_2, $var_3 ) = @_;

For very short functions (5 lines or less) it is reasonable to refer to the arguments array directly (using $_[0] et al) and avoid local variables entirely.

See also: Optimizing code.

Error-handling

There are a number of different methods of dealing with errors in the codebase. Broadly, these fall into the categories of "things that should return a polite error and carry on" and "things that can give an error message that won't make sense to most users". For the latter, there are still a few different methods about, and if making minor changes to existing code you should probably respect the method that is used in that code. But for new code, use die or Carp::croak. See this comment from Mark for a full explanation.

LJ Specific Conventions

Use LJ::is_enabled()

Use LJ::is_enabled() instead of $LJ::DISABLED. Then you don't have to use negative logic and can pass parameters to it. LJ::is_enabled wraps $LJ::DISABLED in conf_test().

Preferred variable usage of $LJ::HOME

$LJ::HOME/cgi-bin is preferred over $ENV{LJHOME}/cgi-bin -- use $LJ::HOME instead, and if you see the old form, feel free to change it.

However, some files, notably those in the bin/ directory, need to use the old form because the code file that defines $LJ::HOME isn't loaded. So, if you see something like this:

use lib "$ENV{LJHOME}/cgi-bin";
require 'ljlib.pl';

That's trying to bootstrap the system. Until after ljlib.pl is fully loaded, $LJ::HOME is empty/undef, so we are allowed to depend on the existence of the environment variable.

Use $u->get_cap

$u->get_cap is preferred over LJ::get_cap($u, ...).

But even better is adding the method as an accessor in LJ::User. Then you won't have to actually check caps every time.

Commenting

Please comment your new code as extensively as possible so that someone coming along behind you can follow what you're doing. If you're working with old code, and you find yourself having to puzzle through what various functions are doing, comment it up to indicate what you figure out. We want everything to be as understandable as possible for anyone who comes along after you. You don't need to comment every line, but anything of moderate complexity should have a line or two indicating what it does and why.

Code comments should be kept professional -- no profanity, no offensive or exclusionary language (in particular, the LJ code has a tendency to use the words "ghetto" or "lame" to indicate that something isn't as good as it should be, and we should absolutely avoid that and change it whenever we encounter it), no self-deprecating humor ("this is totally stupid of me but ..."), etc. This does not mean comments are required to be boring - but they should be informative, precise, and not offensive.

If you are refactoring code that contains language that does not meet with the Dreamwidth standards, please go ahead and change it as you refactor the code. We want people to be able to see our codebase and come away feeling that it's a clean and approachable system.

If you need to indicate that something will need attention again later, we've standardized on "FIXME" as our "fix me or add to me later" indicator. Always indicate why you're flagging something FIXME:

# FIXME: Works as-is, but needs foo, bar, and baz later.
# FIXME: Totally broken. Needs foo, bar, and baz to work again.

English stripping

Effective English stripping is a requirement! While we might not translate DW.org into other languages, other people who are using the DW code should be given the functionality to do so on their own sites. This means that you need to be conscious of how to effectively strip a page or file of the text in it.

Source file headers and credit

All new files you create should include the appropriate header in the file. You can see the list of sample headers for new dw-free files, new dw-nonfree files, and files that used to be 'livejournal' code but have been completely rewritten at File headers.