Export Journal Notes

Design document for journal export function

Written by vlion (Paul Nathan)

DRAFT * DRAFT * DRAFT

COMMENTS APPRECIATED.

Contributors:

  • exor674[2]
  • Karen Wolf[3]


Introduction.

This technical document describes how to implement 'journal export to pdf'[3]. It assumes that the use cases in Design Personas[1] are valid representations of common users.

This document is split into several high-level areas: development notes, the backend, the workflow, the method by which the backend will implement the workflow, and graphic design considerations.

Development process

This is essentially a gigantic reporting application drawing upon DW's code.

To minimize complexity, the principle of YAGNI[8] will be applied to the backend routines.

We will output to LaTeX[5] for these reasons:

  • Open Source
  • Stable[6]
  • Outputs to PDF, PS, DVI, and others[4]
  • Established typesetting system
  • Well-documented

Several streams of effort will need to be applied:

  • Graphic design of the typesetting style
  • Programmatic effort to retrieve the requested information
  • Programmatic effort to display the interface
  • Programmatic effort to spawn the job

Several areas of expertise will need to be brought into play:

  • Graphic design
  • Server admin/management
  • Programming


Backend

The high-level data-flow architecture of this system will look as follows:

DB -> Filter -> Reformatter -> LaTeX Generator -> LaTeX -> Output Interface

The DB->Output line is hereafter referred to as the Pipeline.
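
As a rough illustration of the Pipeline, each stage can be treated as a function that takes the previous stage's output. This is only a sketch: the stage functions below (`filter_stage`, `reformat_stage`, `generate_latex_stage`) and their data shapes are hypothetical stand-ins, not existing Dreamwidth code.

```python
from typing import Callable, Iterable

Stage = Callable[[object], object]


def run_pipeline(params: dict, stages: Iterable[Stage]) -> object:
    """Feed the export parameters through each stage in order."""
    data: object = params
    for stage in stages:
        data = stage(data)
    return data


# Hypothetical stand-ins for the real stages.
def filter_stage(params: dict) -> dict:
    # A real Filter would query the DB here.
    return {"params": params, "entries": [" entry one ", "entry two"]}


def reformat_stage(data: dict) -> dict:
    data["entries"] = [entry.strip() for entry in data["entries"]]
    return data


def generate_latex_stage(data: dict) -> str:
    return "\n\n".join(data["entries"])


result = run_pipeline(
    {"user": "example"},
    [filter_stage, reformat_stage, generate_latex_stage],
)
```

Keeping every stage behind the same call shape would make it easy to drop a stage (YAGNI) or test one in isolation.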

The high-level control flow architecture of this system will look as follows:

Interface->Asynch Job start->Pipeline->notify users

The source of journal data will be referred to generically as the database (DB), whether it is reached through an actual SQL request, a memcached request, or the Perl module interfacing to the database.

The HTML interface will pass the parameters to the asynchronous job, and the pipeline will start.
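
The control flow above might look roughly like the following. A thread pool is used here purely as a stand-in for whatever job system Dreamwidth would actually use; `run_export`, `notify`, and the `/exports/...` path are hypothetical names invented for this sketch.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def run_export(params: dict) -> str:
    """Hypothetical stand-in for the Pipeline; returns the output location."""
    return f"/exports/{params['user']}.zip"


def notify(user: str, location: str, outbox: list) -> None:
    """Hypothetical stand-in for the DW message and email notification."""
    outbox.append((user, location))


def start_export_job(executor: ThreadPoolExecutor, params: dict, outbox: list) -> Future:
    """Hand the parameters to a background worker and return immediately."""

    def job() -> str:
        location = run_export(params)
        notify(params["user"], location, outbox)
        return location

    return executor.submit(job)


outbox: list = []
with ThreadPoolExecutor(max_workers=1) as executor:
    future = start_export_job(executor, {"user": "example"}, outbox)
    location = future.result()  # the web request itself would not wait here
```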

The Filter module will receive parameters from the Interface via the asynch starter, request information from the database, do any required cleaning (OPTIONAL: form it into a data structure), and pass the information on to the Reformatter.

The Reformatter will analyse the information and determine whether it needs reformatting. Example cases might be a very deep comment thread, or images that need to be generated/resized. When the information is prepared, it is passed to the LaTeX Generator.
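
A minimal sketch of the Filter step, assuming entries arrive as simple records and the only parameter is a date range. The `Entry` type and its fields are assumptions made for illustration; the real data would come from the DB layer.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Entry:
    posted: date
    subject: str
    body: str


def filter_entries(entries: list, start: date, end: date) -> list:
    """Keep entries posted within [start, end] inclusive, cleaned and sorted."""
    selected = []
    for entry in entries:
        if start <= entry.posted <= end:
            # "Cleaning" here is just whitespace trimming.
            selected.append(Entry(entry.posted, entry.subject.strip(), entry.body.strip()))
    return sorted(selected, key=lambda entry: entry.posted)


entries = [
    Entry(date(2009, 3, 1), " March ", "body "),
    Entry(date(2009, 1, 15), "January", "x"),
    Entry(date(2008, 12, 31), "Old", "y"),
]
selected = filter_entries(entries, date(2009, 1, 1), date(2009, 12, 31))
```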
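
One possible treatment of deep comment threads, shown only as a sketch, is to walk the thread depth-first and clamp the indentation at a maximum level so replies stop marching off the page. The `Comment` type and the `max_depth` parameter are assumptions; the document leaves the actual presentation as an open question.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    text: str
    replies: list = field(default_factory=list)


def flatten_thread(comments: list, max_depth: int, depth: int = 0) -> list:
    """Return (indent_level, comment) pairs in reading order.

    Indentation never exceeds max_depth, however deep the thread is.
    """
    rows = []
    for comment in comments:
        rows.append((min(depth, max_depth), comment))
        rows.extend(flatten_thread(comment.replies, max_depth, depth + 1))
    return rows


# A four-level reply chain: a -> b -> c -> d
thread = [Comment("a", "1", [Comment("b", "2", [Comment("c", "3", [Comment("d", "4")])])])]
levels = [level for level, _ in flatten_thread(thread, max_depth=2)]
```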

The LaTeX Generator builds a LaTeX file with the journal/comment/picture information inside. It is probable that Dreamwidth will need to implement custom environments to accommodate threaded comments. Links will be added as a bibliography section. When the LaTeX Generator is finished, it writes several files out to the file system and begins the LaTeX process to generate the final file. BibTeX may be used for references, in which case the usual multipass LaTeX-BibTeX-LaTeX sequence will be used in that process.

When the final file is prepared, it is zipped and moved to a download location, and the user is notified of that location via DW message and email.
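
Two details of the generator lend themselves to a short sketch: user text must have LaTeX's special characters escaped before it is written into the file, and the multipass run can be expressed as a list of commands. The helper names are hypothetical, and the choice of `pdflatex` is an assumption; the commands are only built here, not executed.

```python
# Characters with special meaning in LaTeX and a safe replacement for each.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape one character at a time so replacements are never re-escaped."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def build_commands(basename: str, use_bibtex: bool) -> list:
    """Return the command lines for a single-pass or multipass run."""
    latex = ["pdflatex", "-interaction=nonstopmode", basename + ".tex"]
    if not use_bibtex:
        return [latex]
    # LaTeX, BibTeX, then LaTeX twice more to resolve citations.
    return [latex, ["bibtex", basename], latex, latex]


escaped = escape_latex("50% of $x_1 & y")
commands = build_commands("journal", use_bibtex=True)
```

Each command list could then be passed to `subprocess.run` with the working directory set to the job's private directory.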
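
The zipping step could be sketched with the standard `zipfile` module as below. The file names and the `package_export` helper are placeholders invented for the example.

```python
import tempfile
import zipfile
from pathlib import Path


def package_export(files: list, archive_path: Path) -> Path:
    """Write the given files into a ZIP archive, stored by base name only."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            zf.write(path, arcname=Path(path).name)
    return Path(archive_path)


with tempfile.TemporaryDirectory() as tmp:
    pdf = Path(tmp) / "journal.pdf"
    pdf.write_bytes(b"placeholder for the generated file")
    archive = package_export([pdf], Path(tmp) / "export.zip")
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
```

Storing files by base name keeps server-side directory paths out of the archive the user downloads.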

dot format file

Security considerations:

  • Can an unauthorized user peek at the running process of another user?
  • Can an unauthorized user peek at the temporary files generated by LaTeX (and BibTeX)?
  • Can an unauthorized user view the final file?
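
Two of these questions have well-known partial answers, sketched below under stated assumptions: run each job in a private working directory, and guard the final file with both an ownership check and an unguessable token. The function names are hypothetical, and this does not address process visibility.

```python
import os
import secrets
import stat
import tempfile


def make_private_workdir() -> str:
    """Create a working directory for LaTeX/BibTeX temporary files.

    tempfile.mkdtemp creates the directory so that only the creating
    user ID can read, write, or search it.
    """
    return tempfile.mkdtemp(prefix="dw-export-")


def make_download_token() -> str:
    """Return an unguessable token to embed in the download link."""
    return secrets.token_urlsafe(32)


def may_download(requester: str, owner: str, presented: str, expected: str) -> bool:
    """Require both ownership and a matching token (constant-time compare)."""
    return requester == owner and secrets.compare_digest(presented, expected)


workdir = make_private_workdir()
mode = stat.S_IMODE(os.stat(workdir).st_mode)  # permission bits of the directory
os.rmdir(workdir)
token = make_download_token()
```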

Workflow

The 'export' function will be accessed via the 'create' menu found in $username.dreamwidth.org.

Export will bring up an interface that asks for the following:

  • Selection of (journal|community) to export
  • Selection of date range to export
  • Option to export userpics
  • Option to export images found in the journal
  • Output format selection (html|pdf|latex| ??? )
  • Output paper formats (numbered pages, font, paper size, ???)
  • Option to export by tag, security level, or user list
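
The options above could be carried from the interface to the job as one validated record. The sketch below is an assumption about shape only: the field names, defaults, and the set of allowed formats are invented for illustration, and the "???" items are left out.

```python
from dataclasses import dataclass, field
from datetime import date

ALLOWED_FORMATS = {"html", "pdf", "latex"}


@dataclass
class ExportOptions:
    journal: str                      # journal or community name
    start: date
    end: date
    include_userpics: bool = False
    include_images: bool = False
    output_format: str = "pdf"
    paper_size: str = "a4"            # assumed default
    tags: list = field(default_factory=list)

    def validate(self) -> list:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        if not self.journal:
            errors.append("journal is required")
        if self.start > self.end:
            errors.append("start date is after end date")
        if self.output_format not in ALLOWED_FORMATS:
            errors.append(f"unknown output format: {self.output_format}")
        return errors


good = ExportOptions("example", date(2009, 1, 1), date(2009, 12, 31))
bad = ExportOptions("", date(2009, 12, 31), date(2009, 1, 1), output_format="docx")
```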

The user will select their options and press the big export button.

The user will be given an estimate of job completion time on the next page. At this point, the user will not be able to start a new export until the previous export has completed. Refreshing will not restart the job.

Loading the Exporter interface while an export job is running will show an estimate of when the job will be done.
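
The "one export at a time, refresh does not restart" rule can be sketched as a small registry keyed by user. This in-memory class is a hypothetical illustration; a real implementation would need shared, persistent state across web servers.

```python
class ExportJobRegistry:
    """Tracks at most one running export job per user."""

    def __init__(self) -> None:
        self._running: dict = {}  # user -> job id

    def start(self, user: str, job_id: str) -> str:
        """Register job_id unless the user already has a job running.

        Returns the id of the job that is actually active, so a page
        refresh simply gets the existing job back.
        """
        return self._running.setdefault(user, job_id)

    def is_running(self, user: str) -> bool:
        return user in self._running

    def finish(self, user: str) -> None:
        self._running.pop(user, None)


registry = ExportJobRegistry()
first = registry.start("alice", "job-1")
refreshed = registry.start("alice", "job-2")  # refresh: keeps job-1
registry.finish("alice")
later = registry.start("alice", "job-3")
```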

When the job completes, the user will receive an email at their registered email account, as well as a notification in the inbox.

The user will be given a link to download the exported file (probably as a ZIP). The file will remain on Dreamwidth's servers for some length of time (30 days?).
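
If a fixed retention period is adopted, the cleanup check is simple date arithmetic. The 30-day value below just mirrors the open question above and is not a decided figure; the function names are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # placeholder; the actual period is undecided


def expires_at(created: datetime) -> datetime:
    """Time at which an exported file becomes eligible for deletion."""
    return created + RETENTION


def is_expired(created: datetime, now: datetime) -> bool:
    return now >= expires_at(created)


created = datetime(2009, 6, 1)
```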

Workflow<->Backend

Graphic Design

  • Links will be added as a bibliography

Questions to be answered.

  • Deep (horizontal) comment threads pose a presentation issue that must be resolved. Suggestions requested.
  • Option to have advanced users upload their own style?

Business Considerations

  • Paid users only?
  • All users?
  • 1 year cooldown for free users?
  • Paid users get finer granularity?
  • Differentiate between paid users and paid communities?
  • Partnership with book printing group?

Hooks shall be placed in the code to enable the business considerations to be implemented and adjusted over time.
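
One way such hooks might look, as a sketch only: the export code asks a set of registered policy functions whether the user may proceed, so business rules can be added or swapped without touching the export logic. The hook names and the `paid` user field are hypothetical.

```python
from typing import Callable, Dict

PolicyHook = Callable[[dict], bool]

_hooks: Dict[str, PolicyHook] = {}


def register_hook(name: str, hook: PolicyHook) -> None:
    """Add or replace a named business-policy check."""
    _hooks[name] = hook


def may_export(user: dict) -> bool:
    """Allow the export only if every registered hook agrees."""
    return all(hook(user) for hook in _hooks.values())


def paid_only(user: dict) -> bool:
    """Hypothetical policy: restrict export to paid users."""
    return bool(user.get("paid"))


before = may_export({"paid": False})  # no hooks registered yet
register_hook("paid_only", paid_only)
```

With no hooks registered everyone is allowed, which matches the "All users?" option; registering `paid_only` switches to the "Paid users only?" option.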

Copyright Concerns:

  • What are the copyright issues here?
  • How do other sites solve them?

References

[1] http://wiki.dwscoalition.org/notes/Design_Personas

[2] http://wiki.dwscoalition.org/notes/User:Exor674/Export_Braindump

[3] http://bugs.dwscoalition.org/attachment.cgi?id=112

[4] http://en.wikibooks.org/wiki/LaTeX/Export_To_Other_Formats

[5] http://www.latex-project.org/

[6] http://en.wikipedia.org/wiki/LaTeX

[7] http://dw-dev.dreamwidth.org/30837.html

[8] http://c2.com/xp/YouArentGonnaNeedIt.html