Export Journal Notes

Latest revision as of 17:35, 10 December 2009

Design document for journal export function

Written by vlion (Paul Nathan)

DRAFT * DRAFT * DRAFT

COMMENTS APPRECIATED.

Contributors:

  • exor674[2]
  • Karen Wolf[3]


Introduction.

This is a technical document describing how to implement 'journal export to PDF'[3]. It is written with the understanding that the use cases in Design Personas[1] are valid representations of common users.

This document is split into several high-level areas. We will describe development notes, the backend, the workflow, the method by which the backend will implement the workflow, and then graphic design considerations.

Development process

This is essentially a gigantic reporting application drawing upon DW's code.

To minimize complexity, the principle of YAGNI[8] will be applied to the backend routines.

We will output to LaTeX[5] for these reasons:

  • Open Source
  • Stable[6]
  • Outputs to PDF, PS, DVI, and others[4]
  • Established typesetting system
  • Well-documented

Several streams of effort will need to be applied:

  • Graphic design of the typesetting style
  • Programmatic effort to retrieve the requested information
  • Programmatic effort to display the interface
  • Programmatic effort to spawn the job

Several areas of expertise will need to be brought into play:

  • Graphic design
  • Server admin/management
  • Programming


Backend

The high-level data-flow architecture of this system will look as follows:

DB -> Filter -> Reformatter -> LaTeX Generator -> LaTeX -> Output Interface

The DB->Output line is hereafter referred to as the Pipeline.
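As a rough illustration of the Pipeline, each stage can be treated as a function whose output feeds the next stage. The sketch below is only that: the stage names mirror the diagram, but every function body, the dictionary layout, and the sample entries are placeholders invented for illustration, not DW code.

```python
from functools import reduce


def run_pipeline(stages, request):
    """Pass the request through each stage in order, left to right."""
    return reduce(lambda data, stage: stage(data), stages, request)


# Placeholder stages (all hypothetical); real ones would call into DW's code.
def fetch_from_db(request):
    return {"request": request, "entries": ["First post", "", "Second post"]}


def filter_stage(data):
    # Drop empty entries as a stand-in for "any required cleaning".
    return {**data, "entries": [e for e in data["entries"] if e]}


def reformat_stage(data):
    return data  # nothing needs reformatting in this toy example


def latex_generator(data):
    body = "\n\n".join(data["entries"])
    return (
        "\\documentclass{article}\n\\begin{document}\n"
        + body
        + "\n\\end{document}\n"
    )


PIPELINE = [fetch_from_db, filter_stage, reformat_stage, latex_generator]
latex_source = run_pipeline(PIPELINE, {"journal": "example"})
```

Keeping the stages as plain functions in a list makes it cheap to add, drop, or reorder a stage later, which fits the YAGNI approach.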

The high-level control flow architecture of this system will look as follows:

Interface->Asynch Job start->Pipeline->notify users
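The control flow might be sketched as follows. A thread pool stands in here for whatever job system DW would actually use; that choice, the function names, and the example path are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

notifications = []  # stand-in for the DW-message and email notifications


def pipeline(params):
    # Placeholder for the DB -> Output Pipeline; returns a made-up location.
    return "/exports/" + params["user"] + ".zip"


def notify(user, location):
    notifications.append((user, location))


def start_export(executor, params):
    """Start the job asynchronously and notify the user when it finishes."""
    future = executor.submit(pipeline, params)
    future.add_done_callback(lambda f: notify(params["user"], f.result()))
    return future


# Leaving the "with" block waits for the submitted job to finish.
with ThreadPoolExecutor(max_workers=1) as executor:
    job = start_export(executor, {"user": "example_user"})
```

The interface returns as soon as `start_export` has submitted the job; the notification step is attached as a callback rather than polled for.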

The information block will be generically referred to as the database (DB), whether it is an actual SQL request, a memcached request, or the Perl module interfacing to the database.

The HTML interface will pass the parameters to the asynchronous job, and the pipeline will start.

The Filter module will receive parameters from the Interface via the asynch starter, request information from the database, do any required cleaning (OPTIONAL: form it into a data structure), and pass the information on to the Reformatter.
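A minimal sketch of the Filter step, assuming entries arrive as simple records. The `Entry` and `ExportParams` types, their field names, and the security labels are hypothetical and do not reflect DW's schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Entry:
    posted: date
    security: str                  # e.g. "public" or "private" (assumed labels)
    tags: frozenset = frozenset()
    text: str = ""


@dataclass
class ExportParams:
    start: date
    end: date
    tags: Optional[frozenset] = None             # None means "any tag"
    security_levels: Optional[frozenset] = None  # None means "any level"


def filter_entries(entries, params):
    """Keep entries matching the requested date range, tags and security."""
    selected = []
    for entry in entries:
        if not (params.start <= entry.posted <= params.end):
            continue
        if params.tags is not None and not (entry.tags & params.tags):
            continue
        if (params.security_levels is not None
                and entry.security not in params.security_levels):
            continue
        selected.append(entry)
    # Hand entries to the Reformatter in chronological order.
    return sorted(selected, key=lambda entry: entry.posted)
```

For example, `filter_entries(entries, ExportParams(date(2009, 1, 1), date(2009, 12, 31), tags=frozenset({"travel"})))` would keep only the 2009 entries tagged "travel".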

The Reformatter will analyse the information and determine if it needs reformatting. Example cases might be a super-deep comment thread, or images that need to be generated/resized. When the information is prepared, it is passed to the LaTeX Generator.
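One possible treatment of a super-deep comment thread is to cap the indentation depth so that replies past a limit are printed at the deepest allowed level. This is only one option among several, and the `Comment` type and the default limit of 4 are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    text: str
    replies: list = field(default_factory=list)


def flatten_thread(comment, max_depth=4, depth=0):
    """Yield (display_depth, comment) pairs in reading order.

    The display depth never exceeds max_depth, so very deep threads stop
    drifting to the right on the printed page.
    """
    yield (min(depth, max_depth), comment)
    for reply in comment.replies:
        yield from flatten_thread(reply, max_depth, depth + 1)
```

The reply order is preserved; only the indentation is clamped, so the output needs some other cue (such as "in reply to") to stay readable past the limit.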

The LaTeX Generator builds a LaTeX file with the journal/comment/picture information inside. It is probable that Dreamwidth will need to implement custom environments to accommodate threaded comments. Links will be added as a bibliography section. When the LaTeX Generator is finished, it writes several files out to the file system and begins the LaTeX process to generate the final file. BibTeX may be used for references, in which case the usual multipass LaTeX-BibTeX-LaTeX sequence will be used in that process.
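Two pieces of the generator can be sketched briefly: escaping user text so that it is safe to place in a LaTeX file, and listing the commands for the multipass build. The `dwcomment` environment name and its two arguments are hypothetical, as is the choice of `pdflatex` as the engine; the commands are only built here, not executed.

```python
# Characters with special meaning in LaTeX and their printable forms.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "#": r"\#",
    "%": r"\%",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}
_ESCAPE_TABLE = str.maketrans(_LATEX_SPECIALS)


def latex_escape(text):
    """Escape user-supplied text; each character is translated exactly once."""
    return text.translate(_ESCAPE_TABLE)


def render_comment(author, text, depth):
    """Wrap one comment in a (hypothetical) custom 'dwcomment' environment."""
    return (
        "\\begin{dwcomment}{" + str(depth) + "}{" + latex_escape(author) + "}\n"
        + latex_escape(text)
        + "\n\\end{dwcomment}\n"
    )


def build_commands(jobname, use_bibtex):
    """Return the command lines to run, in order, for one export."""
    latex = ["pdflatex", "-interaction=nonstopmode", "-halt-on-error",
             jobname + ".tex"]
    if not use_bibtex:
        return [latex]
    # LaTeX, then BibTeX, then LaTeX twice more to resolve the references.
    return [latex, ["bibtex", jobname], latex, latex]
```

Escaping matters for more than appearance: journal text is untrusted input, and unescaped backslashes would let it inject LaTeX commands into the build.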

When the final file is prepared, it is zipped and moved to a location and the user is notified via DW-message and email where the location is.
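The zip-and-move step could look roughly like this. The function name and the idea of a single destination directory are assumptions; where the archive actually lives is left open by this document.

```python
import tempfile
import zipfile
from pathlib import Path


def package_export(final_file: Path, dest_dir: Path) -> Path:
    """Zip the finished file into dest_dir and return the archive's path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / (final_file.stem + ".zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name, not the server-side directory layout.
        zf.write(final_file, arcname=final_file.name)
    return archive


# Usage with throwaway directories:
with tempfile.TemporaryDirectory() as work:
    pdf = Path(work) / "journal.pdf"
    pdf.write_bytes(b"%PDF-1.4 placeholder")
    archive = package_export(pdf, Path(work) / "downloads")
    with zipfile.ZipFile(archive) as zf:
        archive_members = zf.namelist()
```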

Architecture diagram: dot format file (wiki page "architecture for the export journal in dot").

Security considerations:

  • Can an unauthorized user peek at the running process of another user?
  • Can an unauthorized user peek at the temporary files generated by LaTeX (and BibTeX)?
  • Can an unauthorized user view the final file?
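Some of these questions have inexpensive partial answers. The sketch below assumes a POSIX host and a download record kept as a small dictionary; the record layout and function names are illustrative only. It does not address process visibility, which depends on how the server itself is configured.

```python
import hmac
import secrets
import tempfile


def make_work_dir():
    """Create a private scratch directory for LaTeX/BibTeX temporary files.

    tempfile.mkdtemp makes the directory readable, writable and searchable
    only by the user that created it.
    """
    return tempfile.mkdtemp(prefix="export-")


def new_download_token():
    """Return an unguessable token to embed in the download link."""
    return secrets.token_urlsafe(32)


def may_download(requesting_user, presented_token, record):
    """Allow the download only for the owner holding the matching token."""
    return (
        requesting_user == record["owner"]
        and hmac.compare_digest(presented_token, record["token"])
    )
```

Checking both the logged-in user and the token means that a leaked link alone is not enough to fetch someone else's export.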

Workflow

The 'export' function will be accessed via the 'create' menu found in $username.dreamwidth.org.

The Export will bring up an interface that will present these options:

  • Selection of (journal|community) to export
  • Selection of date range to export
  • Option to export userpics
  • Option to export images found in the journal
  • Output format selection (html|pdf|latex|???)
  • Output paper formats (numbered pages, font, paper size, ???)
  • Export by tag, security level, or user list

The user will select the options they choose and press the big export button.

The user will be given an estimate of job completion time on the next page. At this point, the user will not be able to start a new export until the previous export has completed. Refreshing will not restart the job.

Loading the Exporter interface when the export job is running will bring up an estimate of when the job will be done.
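The one-export-at-a-time rule can be sketched as a small registry keyed by user. The class below keeps its state in memory purely for illustration; a real deployment would need shared storage, and all names here are assumptions.

```python
from datetime import datetime, timedelta


class ExportJobs:
    """Track at most one running export per user."""

    def __init__(self):
        self._running = {}  # user -> estimated completion time

    def start(self, user, now, estimated_duration):
        """Return (eta, started).

        If the user already has a job running, nothing new is started and
        the existing estimate is returned, so a page refresh is harmless.
        """
        existing = self._running.get(user)
        if existing is not None:
            return existing, False
        eta = now + estimated_duration
        self._running[user] = eta
        return eta, True

    def finish(self, user):
        """Mark the user's job as complete so a new export can be started."""
        self._running.pop(user, None)
```

Loading the Exporter interface while a job is running maps to calling `start` again: the caller gets the original estimate back with `started` set to `False`.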

When the job completes, the user will receive an email at their registered email account, as well as a notification in the inbox.

They will have a link to download the exported file (probably a ZIP). The file will remain on Dreamwidth's servers for some length of time (30 days?).
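If the retention period ends up being a fixed number of days, expiry reduces to a date comparison. The 30-day value below is the tentative figure from the text, not a decision, and the mapping of paths to creation times is an assumed bookkeeping format.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # tentative value; still an open question


def expires_at(created):
    return created + RETENTION


def is_expired(created, now):
    return now >= expires_at(created)


def expired_exports(exports, now):
    """Given {path: creation time}, list the paths that are due for removal."""
    return [path for path, created in exports.items() if is_expired(created, now)]
```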

Workflow<->Backend

Graphic Design

  • Links will be added as a bibliography

Questions to be answered.

  • Deep (horizontal) comment threads pose a presentation problem that must be resolved. Suggestions requested.
  • Option to have advanced users upload their own style?

Business Considerations

  • Paid-user only?
  • All users?
  • 1 year cooldown for free users?
  • Paid users get finer granularity?
  • Differentiate between paid users and paid communities?
  • Partnership with book printing group?

Hooks shall be placed in the code to enable the business considerations to be implemented and adjusted over time.
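One way to read "hooks" is as a list of small policy functions consulted before a job starts, so that rules can be added or dropped without touching the Pipeline. Everything in this sketch (the `Account` type, the policy names, the wording of the reasons) is hypothetical; none of the policies shown have been decided.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Account:
    name: str
    is_paid: bool
    last_export: Optional[datetime] = None


def paid_only(account, now):
    """Policy: refuse exports for unpaid accounts."""
    return None if account.is_paid else "export is limited to paid accounts"


def free_cooldown(period):
    """Build a policy that makes free accounts wait between exports."""
    def policy(account, now):
        if account.is_paid or account.last_export is None:
            return None
        if now - account.last_export < period:
            return "free accounts must wait before exporting again"
        return None
    return policy


def refusal_reasons(account, now, policies):
    """Run every configured policy; an empty list means the export may start."""
    reasons = []
    for policy in policies:
        reason = policy(account, now)
        if reason is not None:
            reasons.append(reason)
    return reasons
```

Switching from "paid-user only" to "1 year cooldown for free users" then means changing the configured list from `[paid_only]` to `[free_cooldown(timedelta(days=365))]`.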

Copyright Concerns:

  • What are the copyright issues here?
  • How do other sites solve them?

References

[1] http://wiki.dwscoalition.org/notes/Design_Personas

[2] http://wiki.dwscoalition.org/notes/User:Exor674/Export_Braindump

[3] http://bugs.dwscoalition.org/attachment.cgi?id=112

[4] http://en.wikibooks.org/wiki/LaTeX/Export_To_Other_Formats

[5] http://www.latex-project.org/

[6] http://en.wikipedia.org/wiki/LaTeX

[7] http://dw-dev.dreamwidth.org/30837.html

[8] http://c2.com/xp/YouArentGonnaNeedIt.html