Difference between revisions of "Importing"

From Dreamwidth Notes
m (fixing bad formatting)
(Comments import: added an error)
 
(17 intermediate revisions by 8 users not shown)
Line 1: Line 1:
== Start ==  
+
== TODO ==  
  
Basic download script base:
+
* Get it compiling after some changes merged in from test tool.
 
+
* Userpics
* http://codex.journal-press.com/Ljdump
+
** Need to fail cleaner if this is missing (like on DJ)
 
+
* Entries '''Almost done'''
I don't know anything about scripting, so I don't know if either of the below would be better or worse for our purposes. But
+
** Probably need to scrub a few more things.
 
+
  <lj-template name="qotd" id="..." /> for instance
* [http://www.kelpheavyweaponry.com/trac/ljmigrate/wiki/WikiStart ljmigrate] is based on the ljdump code, captures usericons as well, and has the ability to import to lj-based systems.
+
* Friends
* [http://ljsm.feechki.org/index_en.html ljsm] is a python script which captures usericons and posted images, in addition to entries and comments.
+
** I think this is wanted, import them as OpenID?
 
+
* Comments
== Examples ==
+
** I need to find a proper "insert comment '''and''' metadata in the same instant" command, as I don't want a possible instant where someone can see a comment that is supposed to be screened.
 
+
Good to look at Schwartz workers for reference.  Possibilities:
+
 
+
* Worker for privacy conversion -- long running task limited to not kill the DB.  bin/worker/process-privacy, LJ::MassPrivacy
+
* bin/worker/paidstatus - the schwartz 'shell', that declares the job. it imports the module DW::Worker::PaidStatus
+
* cgi-bin/DW/Worker/PaidStatus.pm - the actual worker; jobs can either succeed, fail permanently, fail temporarily.  note the helper subs at the bottom of that module that define the schwartz parameters for retries, etc.
+
* cgi-bin/DW/Worker/Payment.pm - search for PaidStatus, line 90 -- you'll see the calling convention for creating a new schwartz job and passing arguments
+
* bin/test-pay  - a generic script for testing the payment system
+
 
+
== Rough Implementation Guide ==
+
 
+
# create a new DW::Worker::ContentDownloader module (or something, name doesn't matter much)
+
# create the simple bin/worker/contentdownloader that wraps the module; this is almost a carbon copy of existing bin/worker/paidstatus etc
+
# create a test script that lets you insert a job. the test script can be really simple, something like: <code>bin/test-content-downloader --insert-job xb95</code>, and then the script would have four lines of code, one of which is that line 90 from Payment.pm referenced above, i.e., create the job, give it the arguments
+
# then, you can test that your pipeline works by running the worker in debug mode... um, <code>bin/worker/contentdownloader --debug</code> I think. You probably have to do <code>perl -I/home/dw/cgi-bin bin/worker/contentdownloader --debug</code> to get the include path setup right
+
# anyway, once all that works - your job is working, your module is getting called, etc, then you can pretty easily drop in a call out to the download script. system("/path/to/python", "/path/to/script", "--username", "xb95"); and then worry about error checking, timeout checking, and all the million failure cases
+
 
+
=== Ideal Process ===
+
 
+
# run the content downloader,
+
# copy data to mogile
+
# mark the user as 'downloaded'
+
# run content importer
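
The "run the content downloader" step, with the error and timeout checking mentioned in the guide, could be sketched roughly as below. This is only an illustration in Python (the language of the download script), not the worker itself: the function name `run_downloader`, the three outcome strings, and the use of exit code 75 (EX_TEMPFAIL from sysexits.h) to signal a temporary failure are all assumptions, chosen to mirror the succeed / fail temporarily / fail permanently outcomes of a Schwartz job.

```python
import subprocess

# Hypothetical convention: the download script exits with 75 (EX_TEMPFAIL)
# when the failure is worth retrying later.
EXIT_TEMPFAIL = 75


def run_downloader(argv, timeout_seconds=600):
    """Run the download script and classify the outcome.

    Returns "ok", "retry" (temporary failure) or "failed" (permanent failure).
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # The child is killed on timeout; treat this as worth retrying.
        return "retry"
    except OSError:
        # The interpreter or script could not be started at all.
        return "failed"
    if result.returncode == 0:
        return "ok"
    if result.returncode == EXIT_TEMPFAIL:
        return "retry"
    return "failed"
```

A call would look like `run_downloader(["/path/to/python", "/path/to/script", "--username", "xb95"])`, using the same placeholder paths as the guide.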
+
  
 
== Other Ideas ==
 
== Other Ideas ==
Line 84: Line 61:
  
 
We were unable to transfer your journal OTHERNAME from SERVICE to your account USERNAME on Dreamwidth.  FAILURE MESSAGE.
 
We were unable to transfer your journal OTHERNAME from SERVICE to your account USERNAME on Dreamwidth.  FAILURE MESSAGE.
 +
 +
We apologize for the problem and hope things can be resolved.
 +
 +
--Dreamwidth
  
 
==== Possible failure messages ====
 
==== Possible failure messages ====
  
 
* The username and password you gave us for SERVICE were rejected, so we couldn't download any information.
 
* The username and password you gave us for SERVICE were rejected, so we couldn't download any information.
 +
* SERVICE does not seem to be LiveJournal-based.
 +
* SERVICE does not seem to exist.  If you are sure the service exists, please check the service URL you gave us and try again later.
  
 
=== Journal transferred ===
 
=== Journal transferred ===
Line 96: Line 79:
 
Congratulations, USERNAME!
 
Congratulations, USERNAME!
  
Your journal OTHERNAME on SERVICE has been transferred to your account USERNAME on Dreamwidth. (OPTIONAL: We did run into a couple of possible issues, however:
+
Your journal OTHERNAME on SERVICE has been transferred to your account USERNAME on Dreamwidth.  
  
PROBLEM LIST
+
(OPTIONAL ISSUE/NOTIFICATION TEXT INSERTS: Things you might want to look at:)
  
In order to fix this, we suggest you SOLUTIONS.)
+
HAPPY STATEMENT GOES HERE
 +
 
 +
-- Dreamwidth
  
 
==== Possible issue texts ====
 
==== Possible issue texts ====
  
* Some entries couldn't be imported.
+
===== Icon import =====
* We were unable to import your comments.
+
* We were unable to import your userpics.
+
* etc
+
  
=== Ways to enter a service name ===
+
* Because you have more icons on SERVICE than we can import, we have imported your default icon.  You can choose which other icons you wish to import at URL.
 +
* Unfortunately, SERVICE doesn't support icon importing.  You will have to manually upload your icons from this service.
 +
 
 +
===== Entries import =====
 +
 
 +
* Some entries couldn't be imported.  You can view a list and the reasons at URL.
 +
* We can't automatically transfer polls.  You can view a list of entries with polls so you can manually recreate them at URL.
 +
* We can't automatically transfer embedded content, such as YouTube videos.  You can view a list of those entries containing embedded content and manually re-insert the item at URL.
 +
 
 +
===== Comments import =====
 +
 
 +
* Unfortunately, SERVICE doesn't support comment importing, so we won't be able to import your comments.
 +
* We're not sure why we weren't able to import a comment on POST, but it's missing and so we can't import it or any of its replies.
 +
 
 +
== Ways to enter a service name ==
 
(these provided by <ljuser>cheyinka</ljuser>)
 
(these provided by <ljuser>cheyinka</ljuser>)
  
* www.somejournal.com
+
* www.somejournal.com '''WORKS'''
* http://www.somejournal.com
+
* http://www.somejournal.com '''WORKS'''
* username.somejournal.com
+
* username.somejournal.com '''WORKS'''
* http://username.somejournal.com
+
* http://username.somejournal.com '''WORKS'''
* somejournal.com
+
* somejournal.com '''WORKS'''
* somejournal
+
* somejournal '''can't work'''
* SJ
+
* SJ '''can't work'''
* username (the ever-popular "didn't read the directions" option)
+
* username (the ever-popular "didn't read the directions" option) '''can't work'''
* internet explorer / semagic / &c. (probably indistinguishable from 'username' as far as the code's concerned)
+
* internet explorer / semagic / &c. (probably indistinguishable from 'username' as far as the code's concerned) '''can't work'''
 +
 
 +
From [[User:John|John]]:
 +
 
 +
* on somejournal '''can't work'''
 +
* somejorunal or similar misspellings, though no idea how to fix that. '''can't work'''
 +
* AOL (:D) '''can't work'''
 +
* username at/on somejournal '''can't work'''
 +
* username@somejournal.com '''WORKS'''
 +
* username at somejournal dot com '''WORKS'''
 +
* SJ (where there are two services with the same initials. Perhaps not so much a consideration now, but in the event of super-federation Dreamwidth extreeeme, a possibility) '''can't work'''
 +
 
 +
From Janine:
 +
 
 +
* http://www.somejournal.com/users/username '''WORKS'''
 +
* http://www.somejournal.com/~username '''WORKS'''
 +
* http://users.somejournal.com/username '''WORKS'''
 +
* A link to a particular entry in the journal, or really any other page within the journal '''WORKS'''
 +
 
 +
== Compatibility ==
 +
 
 +
Output from test tool found here: http://linode2.andreanall.com/~anall/hidden/check_compat.txt
 +
 
 +
=== LiveJournal ===
 +
 
 +
Works just fine.
 +
 
 +
=== DeadJournal ===
 +
 
 +
          Cannocalize: OK (www.deadjournal.com)
 +
      SessionGenerate: OK
 +
              UserURL: OK ('''redacted''')
 +
            Userpics: FAIL
 +
              Groups: OK
 +
                Tags: OK
 +
                  Bio: OK (downloaded)
 +
            SyncItems: OK
 +
            GetEvents: OK
 +
          CommentMeta: FAIL
 +
          CommentMeta: FAIL
 +
www.deadjournal.com Compatible: FAIL
 +
 
 +
DeadJournal appears to be rejecting my ljsession cookie for some reason.
 +
 
 +
=== JournalFen ===
 +
Seems to be all OKs
 +
 
 +
=== InsaneJournal ===
 +
All OK.
  
 
[[Category:Design specs]]
 
[[Category:Design specs]]

Latest revision as of 02:11, 5 February 2009

TODO

  • Get it compiling after some changes merged in from test tool.
  • Userpics
    • Need to fail cleaner if this is missing (like on DJ)
  • Entries (almost done)
    • Probably need to scrub a few more things, for instance <lj-template name="qotd" id="..." />
  • Friends
    • I think this is wanted, import them as OpenID?
  • Comments
    • I need to find a proper "insert comment and metadata in the same instant" command, as I don't want a possible instant where someone can see a comment that is supposed to be screened.

Other Ideas

  • [info]exor674: I think one of the things we'd need to add for the import feature is an external-logid logprop, so that if somebody tries to import twice they won't get duplicate entries, and so the system can know "oh, DW entry 1 was originally livejournal.com-exor674-5827377212221211"
  • "Imported from" tag and link
  • Options: "import tags, tag prefix, tag suffix" "import comments"? "import friend groups, tag prefix, tag suffix" "import all as *security*"
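
The external-logid idea could work roughly as in the sketch below, written in Python purely for illustration. The key format follows the example in the note above (host, username, remote entry id joined by hyphens); the function names, the `id` field on each entry, and the in-memory set standing in for the stored logprops are assumptions.

```python
def external_logid(service_host, username, remote_id):
    """Build the key that records where an imported entry came from."""
    return f"{service_host}-{username}-{remote_id}"


def select_new_entries(remote_entries, already_imported, service_host, username):
    """Return only the entries whose key has not been seen before.

    `already_imported` is a set of keys and is updated in place, so a
    second import of the same journal yields no duplicates.
    """
    fresh = []
    for entry in remote_entries:
        key = external_logid(service_host, username, entry["id"])
        if key in already_imported:
            continue  # this entry was imported by an earlier run
        already_imported.add(key)
        fresh.append(entry)
    return fresh
```

Running the import twice against the same set returns an empty list the second time, which is the behaviour the note asks for.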

Importing Comments

Suppose comments belonging to others were imported and posted privately, while comments belonging to the user were imported and posted according to their current screening setting, in proper threads with the private comments. This would allow the maximum amount of the user's own content to show up.

Suppose also that the comment screening mechanism were modified to handle privately-posted imported comments. This would be done according to the screening settings on the original post: if unscreened on the original, someone OpenID-authenticated as the original comment poster could elect to own and unprivatescreen comments belonging to them.

After this point, the journal owner could screen/unscreen the comment at will.

To make this work better, imported comments should be listed somewhere that the OpenID owner of the comments could find them and possibly mass-unprivatescreen/take ownership.

[Business decision from [info]rahaeli:

If someone is importing their journal directly from another site, by giving us their username and password for that site, import all comments as OpenID comments with the same screening/visibility level that the original comment had.

If someone is importing their journal from a downloaded backup file, import all comments as OpenID comments screened, frozen, visible only to the journal owner and the OpenID identity that made the comment, and unable to be unscreened/unfrozen by the journal owner him/herself, only the OpenID identity. That way, the OpenID identity holder can verify that their words haven't been edited offline.

Either way, it would be very very nice if there were a place where an OpenID identity could go and see all their OpenID comments that have been imported and take ownership of them under a specific linked DW account.]
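
The two cases in the business decision above can be written down as a small decision function. This is a Python sketch for discussion only: the names `ImportedCommentState` and `imported_comment_state`, the source labels "direct" and "backup", and the choice to treat directly imported comments as not frozen (the decision does not mention freezing for that case) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportedCommentState:
    screened: bool
    frozen: bool
    # True if the journal owner may unscreen/unfreeze the comment;
    # False if only the commenter's OpenID identity may do so.
    owner_can_unlock: bool


def imported_comment_state(source, originally_screened):
    """Return the state an imported OpenID comment should start in."""
    if source == "direct":
        # Imported with the remote username and password: keep the
        # original screening level. Not frozen is an assumption here.
        return ImportedCommentState(
            screened=originally_screened, frozen=False, owner_can_unlock=True
        )
    if source == "backup":
        # Imported from a downloaded backup file: always screened and
        # frozen, and only the OpenID identity can release it.
        return ImportedCommentState(
            screened=True, frozen=True, owner_can_unlock=False
        )
    raise ValueError(f"unknown import source: {source!r}")
```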

Email templates

These email templates will be sent out when we manage or fail to transfer a journal.

Could not access account

Title
Could not transfer your journal from SERVICE to your account USERNAME on Dreamwidth
Body
Hi, USERNAME--

We were unable to transfer your journal OTHERNAME from SERVICE to your account USERNAME on Dreamwidth. FAILURE MESSAGE.

We apologize for the problem and hope things can be resolved.

--Dreamwidth

Possible failure messages

  • The username and password you gave us for SERVICE were rejected, so we couldn't download any information.
  • SERVICE does not seem to be LiveJournal-based.
  • SERVICE does not seem to exist. If you are sure the service exists, please check the service URL you gave us and try again later.

Journal transferred

Title
About your journal transfer from SERVICE to your account USERNAME on Dreamwidth
Body

Congratulations, USERNAME!

Your journal OTHERNAME on SERVICE has been transferred to your account USERNAME on Dreamwidth.

(OPTIONAL ISSUE/NOTIFICATION TEXT INSERTS: Things you might want to look at:)

HAPPY STATEMENT GOES HERE

-- Dreamwidth

Possible issue texts

Icon import
  • Because you have more icons on SERVICE than we can import, we have imported your default icon. You can choose which other icons you wish to import at URL.
  • Unfortunately, SERVICE doesn't support icon importing. You will have to manually upload your icons from this service.
Entries import
  • Some entries couldn't be imported. You can view a list and the reasons at URL.
  • We can't automatically transfer polls. You can view a list of entries with polls so you can manually recreate them at URL.
  • We can't automatically transfer embedded content, such as YouTube videos. You can view a list of those entries containing embedded content and manually re-insert the item at URL.
Comments import
  • Unfortunately, SERVICE doesn't support comment importing, so we won't be able to import your comments.
  • We're not sure why we weren't able to import a comment on POST, but it's missing and so we can't import it or any of its replies.
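
Assembling the "Journal transferred" email from the template and the optional issue texts might look like the Python sketch below. The wording is taken from the template above; the function name, its parameters, and the bullet formatting of the issue list are assumptions, and the happy statement is left as a parameter because the template has not filled it in yet.

```python
def render_transfer_email(username, othername, service, issues=(), happy_statement=""):
    """Build the body of the "Journal transferred" email.

    The "Things you might want to look at" block is included only when
    at least one issue text is supplied.
    """
    lines = [
        f"Congratulations, {username}!",
        "",
        f"Your journal {othername} on {service} has been transferred "
        f"to your account {username} on Dreamwidth.",
    ]
    if issues:
        lines += ["", "Things you might want to look at:"]
        lines += [f"  * {issue}" for issue in issues]
    if happy_statement:
        lines += ["", happy_statement]
    lines += ["", "-- Dreamwidth"]
    return "\n".join(lines)
```

Keeping the issue texts as plain strings passed in by the caller means the icon, entry, and comment importers can each contribute their own messages without the template knowing about them.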

Ways to enter a service name

(these provided by [info]cheyinka)

  • www.somejournal.com WORKS
  • http://www.somejournal.com WORKS
  • username.somejournal.com WORKS
  • http://username.somejournal.com WORKS
  • somejournal.com WORKS
  • somejournal can't work
  • SJ can't work
  • username (the ever-popular "didn't read the directions" option) can't work
  • internet explorer / semagic / &c. (probably indistinguishable from 'username' as far as the code's concerned) can't work

From John:

  • on somejournal can't work
  • somejorunal or similar misspellings, though no idea how to fix that. can't work
  • AOL (:D) can't work
  • username at/on somejournal can't work
  • username@somejournal.com WORKS
  • username at somejournal dot com WORKS
  • SJ (where there are two services with the same initials. Perhaps not so much a consideration now, but in the event of super-federation Dreamwidth extreeeme, a possibility) can't work

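From Janine:

  • http://www.somejournal.com/users/username WORKS
  • http://www.somejournal.com/~username WORKS
  • http://users.somejournal.com/username WORKS
  • A link to a particular entry in the journal, or really any other page within the journal WORKS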
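
The WORKS / can't work split in the lists above comes down to whether a hostname can be pulled out of what the user typed. A rough Python sketch of that check follows; it is not the importer's actual code, the function name `extract_service_host` is made up, and it returns the hostname exactly as typed (lowercased) without deciding whether a leading label is "www" or a username.

```python
import re
from urllib.parse import urlsplit

# A dotted hostname: one or more labels followed by an alphabetic TLD.
_HOST_RE = re.compile(r"^(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$")


def extract_service_host(text):
    """Return the hostname found in user input, or None if there is none."""
    s = text.strip().lower()
    # Spelled-out form: "username at somejournal dot com".
    s = re.sub(r"\s+dot\s+", ".", s)
    s = re.sub(r"\s+at\s+", "@", s)
    if "://" not in s:
        s = "//" + s  # make urlsplit treat the text as a network location
    try:
        host = urlsplit(s).hostname
    except ValueError:
        return None
    if host and _HOST_RE.match(host):
        return host
    return None
```

Inputs such as "somejournal", "SJ", "AOL", or a bare username contain no dotted hostname and come back as None, which matches the "can't work" entries; misspelled but well-formed hostnames would still pass this check and fail later, when the service is contacted.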

Compatibility

Output from test tool found here: http://linode2.andreanall.com/~anall/hidden/check_compat.txt

LiveJournal

Works just fine.

DeadJournal

         Cannocalize: OK (www.deadjournal.com)
     SessionGenerate: OK
             UserURL: OK (redacted)
            Userpics: FAIL
              Groups: OK
                Tags: OK
                 Bio: OK (downloaded)
           SyncItems: OK
           GetEvents: OK
         CommentMeta: FAIL
         CommentMeta: FAIL
www.deadjournal.com Compatible: FAIL

DeadJournal appears to be rejecting my ljsession cookie for some reason.
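
Output in the format shown above is easy to summarise mechanically. The Python sketch below assumes each line of the test tool's report has the shape "Name: OK" or "Name: FAIL" with an optional detail in parentheses, as in the DeadJournal listing; the function names are made up for illustration.

```python
def parse_compat_report(text):
    """Parse report lines into (check name, status, detail) tuples."""
    results = []
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # no colon, so not a check line
        fields = rest.split(None, 1)
        if not fields or fields[0] not in ("OK", "FAIL"):
            continue  # a colon, but no OK/FAIL status after it
        detail = fields[1].strip("()") if len(fields) > 1 else ""
        results.append((name.strip(), fields[0], detail))
    return results


def failed_checks(text):
    """Return the names of all checks reported as FAIL, in order."""
    return [name for name, status, _ in parse_compat_report(text) if status == "FAIL"]
```

A list of tuples is used instead of a dictionary because a check name can appear more than once in the report, as CommentMeta does above.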

JournalFen

All checks appear to be OK.

InsaneJournal

All OK.