Studying

This document gives an overview of the many things that need to be done to implement a fully functional Skritter client.

Note: If you're interested in embarking on such an adventure, you should probably contact us. We're still building all the systems required, and the information presented here is probably incomplete.

Terminology

  • Age and Due: An item's age is the ratio of time since it was last studied to how much time it was scheduled for. We generally want to study an Item when its age is 1. An item whose age is near 1 is due.
  • Containing Word: A multi-character word which contains a given character.
    The inverse of this is Component Character.
  • New/Old Items: Whether or not the given item has been reviewed by the User.
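As a rough illustration of the age definition above, here is a minimal Python sketch. It assumes an Item's last and next values are timestamps in seconds; the function name item_age is hypothetical and not part of any Skritter API.

```python
def item_age(now, last, next_due):
    """Ratio of time since the Item was last studied to its scheduled time."""
    scheduled = next_due - last
    if scheduled <= 0:
        raise ValueError("next must be later than last")
    return (now - last) / scheduled


# Studied at t=1000, scheduled for t=2000, and it is now t=1500:
print(item_age(1500, 1000, 2000))  # 0.5, i.e. halfway to being due
```

An age of 1.0 means the Item is exactly due; values above 1.0 mean it is overdue.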

Loading Data

The first thing to do is fetch Items using the Items Endpoint. These Items represent what the User is studying. You have two basic options:

1. Fetch As Needed ('Flash' method)

Fetch a small batch (10-100) of Items using the 'next' ordering and periodically fetch more as needed. This method is relatively straightforward since you don't have to worry about synchronization. However, it does not allow for fully offline study.

2. Full Account Sync ('iOS' method)

Use the Batch System to fetch all Items at once. Occasionally check the server for changes made from other clients. This is much more complicated to get exactly right, but in the end you get a more powerful client.

Choosing what to study

This step is only required if you do 'Full Account Sync'. Feel free to roll your own algorithm for ordering Items for study. We suggest these features:

  • Order mainly based on age
  • If the chosen Item is for a character, and it's not for the definition, try to find a due Item of the same part for a containing word and study that one instead.
  • Spread out new Items among the old.
  • Remember to filter out the parts and styles the User isn't studying.
  • When many Items are overdue, it can be hard to recover. Triage by presenting Items that are less than 250% due first.
  • For Items of equal priority, order by parts: writing, reading, tone, then definition. This pretty much only happens for new Items which were all added at the same time for a given Vocab.
  • Re-order the Items regularly, if not every prompt. Age changes at different rates for different Items, so it's important to recalculate continuously.
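Some of the suggestions above could be combined along these lines. This is only a Python sketch, not Skritter's algorithm: the dictionary keys "part" and "age", the part names, the function name, and the choice to put the most-due Items first within each group are all assumptions made for illustration.

```python
# Tie-break order for parts, taken from the suggestion list.
PART_ORDER = {"writing": 0, "reading": 1, "tone": 2, "definition": 3}


def order_for_study(items, studied_parts, triage_cutoff=2.5):
    """Return the Items the User is studying, in a suggested study order.

    items: list of dicts with hypothetical keys "part" and "age".
    """
    # Filter out the parts the User isn't studying.
    candidates = [item for item in items if item["part"] in studied_parts]

    def sort_key(item):
        # Triage: Items at or beyond the cutoff (250% due) go last.
        too_overdue = item["age"] >= triage_cutoff
        # Within a group, higher age first; equal ages fall back to part order.
        return (too_overdue, -item["age"], PART_ORDER[item["part"]])

    return sorted(candidates, key=sort_key)


items = [
    {"id": "a", "part": "writing", "age": 4.0},
    {"id": "b", "part": "tone", "age": 1.2},
    {"id": "c", "part": "reading", "age": 1.2},
    {"id": "d", "part": "definition", "age": 0.3},
]
ordered = order_for_study(items, {"writing", "reading", "tone"})
print([item["id"] for item in ordered])  # ['c', 'b', 'a']
```

Because ages change at different rates, a client would recompute each Item's age and call a function like this again before every prompt or so.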

Here's some example code from the iOS app for calculating 'readiness', which is then used to order prompts from lowest to highest.


    double ageWithLongShotDeprioritization( BOOL deprioritizeLongShots,
                                            double t,
                                            double last,
                                            double next,
                                            BOOL singleCharacter,
                                            NSString *key) {

        if(!last && next - t > 600)
            return SK_USER_ITEM_AGE_SPACED;

        if(!last || next - last == 1)
            return SK_USER_ITEM_AGE_NEW;

        // Makes equally due words come up before the characters contained within them
        double lengthPenalty = singleCharacter ? -0.02 : 0;

        // sanity check
        if (t < last)
            return 0.0;

        // Calculate rtd here to take into account spacing, not to use given rtd.
        double seenAgo = t - last;
        double rtd = next - last;
        double readiness = seenAgo / rtd;

        // If readiness is negative, it's an error, unless rtd is <= 1, in which case
        // it's been new and has been spaced, so we can just let it have a
        // negative readiness and never come up.
        if(readiness < 0 && rtd > 1) {
            DLog(@"Warning: last %.2f later than next %.2f for %@ (now is %.2f)",
                 last, next, key, now());
            readiness = 0.7;
        }

        // Tweak readiness to favor items that haven't been seen in a while at really
        // low readinesses, so that new items don't crowd them out and you see everything
        // when studying a small set of words.

        // I want it to grow logarithmically such that it'll give a very small but non-
        // negligible change to items that are older than a few minutes, while giving
        // about .50 boost after a year to something maxed out (10 years).

        // The boost should drop off quickly for items that have some readiness themselves.

        if(readiness > 0.0 && seenAgo > 9000.0) {
            double dayBonus = 1.0;
            double ageBonus = 0.1 * log(
                dayBonus + (dayBonus * dayBonus * seenAgo) * DAYS_IN_A_SECOND);
            double readiness2 = readiness > 1.0 ? 0.0 : 1.0 - readiness;
            ageBonus *= readiness2 * readiness2;  // Less bonus if ready
            readiness += ageBonus;
        }

        // Don't let anything long-term be more ready than 250%; deprioritize the really
        // overdue long shots so that the more doable stuff comes first (down to 150%).

        if(deprioritizeLongShots) {

            if(readiness > 2.5 && rtd > 600.0) {
                if(readiness > 20.0)
                    readiness = 1.5;  // Deprioritize only down to 150%
                else
                    readiness = 3.5 - pow(readiness * 0.4, 0.33333);
            }
            if(lengthPenalty && readiness > 1) {
                readiness = pow(readiness, 1 + lengthPenalty);
            }
        }

        return readiness;
    }
    

Presentation

There are many things to remember to control what is shown when for a Skritter prompt. This details what to show for the four prompt types (writing, tone, reading, definition); the two languages (Chinese and Japanese); single characters vs. multi-character words; various settings (hidden reading, hidden definition, raw squigs, colored tones, eccentric flavor, parts studied, styles studied, Heisig keywords, grading buttons, and tone buttons); and prompt states (whether there are sentences, mnemonics, custom definitions, audio files, uncommon characters, kanji-less writings, simp/trad differences, new words, and not-yet-correct words). It gets complicated.

The Skritter web client can combine prompt types (writing + tone together, or reading + definition together, when both related Items are due). We suggest that you do not attempt this: although it is more efficient for the learner, it is probably not worth the complexity for the developer. If you do want to try it, follow the Skritter web client's behavior when deciding what to show, hide, or obscure, and when.

If you notice inconsistencies or mistakes below, let us know. This is as much to document how it should work so we ourselves remember as it is for guiding you in building clients.

General

  • Audio buttons (word, sentence, and, for Chinese, character) should reflect whether there exists audio or not (enabled vs. crossed-out speaker icons)
  • Raw squigs should convert to some sort of lightweight overlay upon character completion
  • The iOS app's way of handling back, next, erase, hint, show, and grading has proven to be more elegant and useful than the web app's grading buttons, although it's harder to implement and requires more instruction
  • It's important to make finished prompts look pleasant and significantly different from unfinished prompts, ideally with an animated transition
  • Do not show score indicators at the beginning of prompts (since although prompts start off as correct, to Users it should seem like they start off as unscored), but do show them the first time the score changes for a prompt.

Writing Prompts

When the prompt is first displayed, show:

  • Eight-section writing grid
  • Word definition
  • Word reading with tone marks, with (for Japanese) some sort of emphasis on the kana group that corresponds with the current character's kanji group.
  • Character reading with tone marks, or with katakana first and separate from hiragana
  • Play the word audio if on the first character, then (for Chinese) the current character audio
  • The completed characters written so far in the word
  • A blank space or underline for each character yet to be written, with some sort of emphasis on the current character
  • The example sentence, if any, with all CJK characters shared between word and sentence cloze-deleted (blanked and underscored)
  • Optional: a button to select a different example sentence
  • Optional: an Item status line (ex.: "studied 7 wks ago: due (146%)" with hover description "You forgot the writing of this word 7 weeks ago (scheduled for 5 weeks), so it's now 146% ready. You are reviewing it on time.")
  • Optional: a list source line (ex.: "Added from Body parts (detailed): Upper + lower torso and abdomen")
  • If new word: a "New Word" indicator
  • If different between simp/trad and User is studying both: a "simp" or "trad" indicator for the current character and current word
  • If tone coloring: color each reading syllable according to its tone, and optionally also change current character's squig writing color from black to its primary tone color
  • If Heisig keywords: prefix character definition with bolded keyword and a dash (ex: Bosom - chest; bosom; heart)

Do not show, but make quickly accessible without having to show the answer:

  • Character definition
  • Selected character mnemonic
  • Optional: selected word mnemonic
  • If hidden reading mode: word and character readings and audio
  • If hidden definition mode and (not hidden reading mode or exists an example sentence): word definition

Show when show button is pressed:

  • Character writing guide (to fade out when writing begins)
  • Character decomposition
  • If hidden reading mode: the current character's reading in word reading

When erase button is pressed, try to reset:

  • Any written strokes
  • Character writing guide
  • Current completed character in word writing and sentence cloze deletion

When the character is finished, show:

  • Current character in word writing and sentence cloze deletion
  • Character writing, definition, and decomposition
  • Selected character mnemonic
  • Optional: selected word mnemonic
  • Grading buttons (if enabled and setting is used)
  • Score color (could be a glow, an outline, a currently selected grading button color, etc.--but make the "wrong" color pink instead of red when the prompt has never yet been answered correctly)
  • If hidden reading mode, current character's reading in word reading and character audio
  • If hidden reading mode and last character in word, word audio
  • If hidden definition and last character in word: word definition
  • If correct, some positive feedback (such as a randomized congrats message, or if eccentric flavor is enabled, something stranger)

Tone Prompts

Important differences from writing prompts:

  • Do not display the writing grid
  • Display a differently-colored character outline
  • Show/hint/wrong tones should display the correct tone but mark the prompt wrong
  • Wrong tones should display a message such as "Should be 3rd, not 2nd."
  • For mobile-oriented clients, it is optional but not recommended to implement the tone buttons setting--better to leave the buttons off
  • Do follow the hidden reading mode
  • Show toneless pinyin at all times for syllables whose tone prompt is not complete, and unless audio buttons are specifically pressed, don't play audio until corresponding tone marks are also revealed
  • Do not cloze delete sentences, but do underline the prompted word in the sentence
  • In a multiple-character word, instead of displaying the character reading at first, display the character writing

Reading and Definition Prompts

When not doing typing for reading prompts, reading and definition prompts are very similar to each other. Some notes:

  • The web version of Chinese reading prompts allows typing pinyin, which most Users like, but which is inconvenient on mobile and quite complicated to implement properly. We just never got around to it for Japanese reading prompts on the web
  • Show a big version of the word's writing to prompt for the reading or definition
  • Try to make the reading and definition prompts look significantly different, aesthetically
  • Include some sort of "What is the pinyin?" or "What is the definition?" message so that Users are not confused about which one to answer
  • It can be helpful to preserve writing / reading / definition placement between reading and definition prompt layouts so as to reinforce spatially which location should contain which field in Users' minds
  • The answer field (reading or definition) should be larger than the supplementary field (definition or reading)
  • For reading prompts, don't show the definition until afterward
  • For definition prompts, don't show the reading until afterward
  • Show the example sentence and word mnemonic only after the prompt has been completed
  • For reading prompts, play audio only after completion
  • Show/hint/correct functions should complete the prompt, but leave it marked correct (except for pressing the correct button)
  • It is not necessary to provide single-character component information in the case of multi-character words

Scoring

Tone, non-typing reading, and definition prompts will normally only be scored 1 ("forgot", or "don't know" for a word which has never yet been answered correctly) or 3 ("got it"). The User may use grading buttons to mark scores of 2 ("so-so") and 4 ("too easy"), but these won't be auto-assigned except for in writing prompts.

For writing prompts, characters should automatically mark themselves wrong after more than totalStrokes / 4 + 3 strokes are rejected, where totalStrokes is the maximum number of strokes in the longest stroke order for that character. Optionally, characters should mark themselves so-so 2 strokes before that threshold. More than three rejected strokes in a row should trigger a hint. Explicitly asking for a hint (like by a single tap) should increment the number of wrong strokes by 1/4 of the allowed wrong stroke threshold, rounded down. You can play around with these numbers, but they're hard to tune so that everyone is pleased.
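The thresholds described above could be tracked with something like the following Python sketch. The class and method names are hypothetical, and two details are assumptions rather than documented behavior: the threshold is kept as a fraction (no rounding of totalStrokes / 4), and "so-so" applies once the rejected count exceeds the threshold minus 2.

```python
import math


class StrokeGrader:
    """Tracks rejected strokes for one character in a writing prompt."""

    def __init__(self, total_strokes):
        # Allowed number of rejected strokes before the character is wrong.
        self.threshold = total_strokes / 4 + 3
        self.rejected = 0
        self.rejected_in_a_row = 0

    def accept_stroke(self):
        self.rejected_in_a_row = 0

    def reject_stroke(self):
        """Record a rejected stroke; return True if a hint should be shown."""
        self.rejected += 1
        self.rejected_in_a_row += 1
        return self.rejected_in_a_row > 3

    def hint_requested(self):
        # An explicit hint costs a quarter of the threshold, rounded down.
        self.rejected += math.floor(self.threshold / 4)

    def score(self):
        if self.rejected > self.threshold:
            return 1  # forgot
        if self.rejected > self.threshold - 2:
            return 2  # so-so (the optional rule)
        return 3  # got it


grader = StrokeGrader(total_strokes=8)  # threshold is 8 / 4 + 3 = 5.0
for _ in range(4):
    grader.reject_stroke()
print(grader.score())  # 2: four rejections is within two strokes of the limit
```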

When doing a multiple-character prompt (writing or tone), you generate scores for both the word-level item and each character-level item. Here's an example of how you determine the word-level score based on each character-level score:


    import math

    # characterPrompts holds the character-level prompts that make up the word.
    wrongCount = len([prompt for prompt in characterPrompts if prompt.score == 1])

    # If there are just 2 chars, and you get one wrong, you got it wrong.
    if wrongCount == 1 and len(characterPrompts) == 2:
        wordScore = 1

    # If you get 2+ chars wrong, you got it wrong.
    elif wrongCount >= 2:
        wordScore = 1

    # Round it down; decided this was better than rounding up since
    # most of the time it will turn into a 2, which is good.
    else:
        total = sum([prompt.score for prompt in characterPrompts])
        average = total / len(characterPrompts)
        wordScore = math.floor(average)

    

Note that if any of the wrong characters are not also added for study on their own, then the whole word should get marked wrong. An Item is being studied on its own if it contains any values in its vocabIds property.

For the iOS app, we implemented a more complex mechanism: when a word is marked wrong, a character-level Item that was also marked wrong is not submitted if it is less than 70% as ready as the word. The hypothesis is that the User forgot only which character belongs in that word, not how to write that character on its own. This turns out to do the right thing most of the time, and it saves you from having to provide manual char- vs. word-score override buttons. Example implementation:


    - (void)decideWhetherToSubmitSubUI:(SKUserItem *)subUI {
        // Add a flag saying whether to actually submit this review.
        // (If we get the prompt wrong, but we know the character much better than the word,
        // then we'll say that we didn't really fairly test the character,
        // so we'll just ignore the review instead of submitting it.)

        // If the char is at least 70% as ready as the word is, then we'll count it wrong.

        // If the sub-UI is not active for study by itself
        // (most common in Japanese, still common in Chinese), then the user doesn't
        // really care about whether it comes up for review on its own, so we'll not skip it.

        subUI.skipThisReview = (subUI.scoreThisReview == 1 &&
                                self.ui.scoreThisReview == 1 &&
                                subUI.style != SK_NO_STYLE &&
                                [subUI ageWithLongShotDeprioritization:NO] <
                                        0.7 * [self.ui ageWithLongShotDeprioritization:NO]);

        if( subUI.scoreThisReview == 1 &&
            self.lang == SK_ZH &&
            self.part == SK_TONE &&
            ([subUI.base isEqualToString:@"一"] || [subUI.base isEqualToString:@"不"])) {
                // These two have tone sandhi; everyone knows their real tone.
                subUI.skipThisReview = YES;
        }
    }

It is also helpful to skip submission of tone prompt reviews when the answered tone is correct for the character itself, but wrong in the context of the word.

Scheduling

See the Scheduling page for details on how to calculate the new interval for an Item after it has been studied.

You will need SRS Config Entity data to do the scheduling properly. Every hundred reviews or so, you should update these entities from the server to get the latest values.

Spacing Items

Items should also be spaced. If you study an Item for a given Vocab, all other Items that share the same 'base' should have their 'next' property pushed back so there's some minimum distance between them. An Item's 'base' is the second value in the Item's id when you split it on hyphens. So if an Item's id is "ja-日本-0", then the base is "日本". The server spaces Items by one fifth of their current interval (minimum ten minutes), or twelve hours if it's a new Item. If the Item is already scheduled after the calculated spacing distance, the Item's next value does not change. Spacing ensures that Items for the same Vocab don't overlap one another too much, and that the parts of a Vocab a User is having trouble with get priority. Clients should try to match the spacing the server does, though it does not need to be exact.
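The spacing rule could be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that times and intervals are in seconds and that "their current interval" refers to the interval of the Item being pushed back; neither assumption is confirmed here.

```python
TEN_MINUTES = 10 * 60
TWELVE_HOURS = 12 * 60 * 60


def item_base(item_id):
    """Second hyphen-separated value of an Item id."""
    return item_id.split("-")[1]


def spaced_next(now, current_next, interval, is_new):
    """Return the 'next' value for a related Item after spacing."""
    if is_new:
        distance = TWELVE_HOURS
    else:
        distance = max(interval / 5, TEN_MINUTES)
    # Items already scheduled beyond the spacing distance are left alone.
    return max(current_next, now + distance)


print(item_base("ja-日本-0"))  # 日本
```

After a review, a client would apply spaced_next to every other Item whose item_base matches that of the Item just studied.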

Updating your state

First, create the Review objects which will be sent to the server to report what the User has studied. There should be one for every Item involved (i.e., if you study a multi-character word for any part other than the definition, Items for each Vocab listed in the word's containedVocabIds should also have their own separate Review objects). Most properties should be self-explanatory, but here are some other things you should know:

  • bearTime is used for recording how much time the User spent studying on this day. Since Reviews overlap, it doesn't make sense to count every single Review's reviewTime. You should generally choose Reviews for parents to count toward the day's total time spent studying. For example, if you study the word "电话", this will affect Items for "电话", "电", and "话". Let the items for "电话" have bearTime set to true, and false for the others.
  • There are many 'interval' properties. They are named in relation to before the Item was studied. So currentInterval is what it was before the prompt.
  • The previousInterval and previousSuccess properties are used to determine which SRS Config values were used the last time this Item was studied. By knowing which values were used, and the result of this Review, the server can update the values to more closely match the User's target retention rate.
  • Word group is used to group all Reviews in a prompt together. This is mainly to handle the edge case where a word contains the same character multiple times, and so incurs multiple Reviews. Such Reviews after the first one are largely ignored by the server, except for how much time was spent on them. The client should handle this case by creating these duplicate Reviews, but ignoring them when calculating scheduling.

Then, update the Item objects:

  • Add the length of time between when the prompt was shown and when the prompt's answer was either given or shown to timeStudied.
  • Set next to the Review's submitTime plus the calculated interval.
  • Set last and changed to the Review's submitTime.
  • Set previousInterval to the current (unupdated) interval value.
  • Set interval to the calculated interval.
  • Add one to reviews.
  • If the score was 2 or higher, add one to successes.
  • Set previousSuccess to whether or not the score was 2 or higher.
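The Item updates listed above might look like this in Python. The property names come from the list; the function name, the dict representation, and the "score" key on the Review are assumptions. The sketch also assumes that next is the Review's submitTime plus the newly calculated interval, and it clamps the study time at zero as a guard against system clock changes.

```python
def apply_review(item, review, new_interval, prompt_seconds):
    """Update an Item dict in place after it has been studied."""
    submit_time = review["submitTime"]
    passed = review["score"] >= 2

    # Guard against a clock change making the measured time negative.
    item["timeStudied"] += max(0, prompt_seconds)
    item["next"] = submit_time + new_interval  # assumption, see above
    item["last"] = submit_time
    item["changed"] = submit_time
    # Read the old interval before it is overwritten.
    item["previousInterval"] = item["interval"]
    item["interval"] = new_interval
    item["reviews"] += 1
    if passed:
        item["successes"] += 1
    item["previousSuccess"] = passed
    return item
```

The order matters in one place: previousInterval has to be copied before interval is replaced.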

Note: Be wary of system time changes. These happen every so often and will throw off your time tracking. Make sure your time tracking values are not made negative because of these changes.

POSTing data

Use the Reviews Endpoint to save your Reviews to the server. Aside from the Review objects themselves, you'll also need to provide the date on which the Reviews you are submitting happened. Skritter gets the date for the User's current timezone, shifting time by four hours so that the hours between midnight and 4AM still count toward the previous day. Get the current local time from your client and use it to determine the date, applying the same four-hour shift.
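The four-hour shift is a small calculation. A minimal Python sketch, assuming the client already has the User's local time as a naive datetime (the function name study_date is hypothetical):

```python
from datetime import datetime, timedelta


def study_date(local_now):
    """Date a Review counts toward, with the day rolling over at 4AM."""
    return (local_now - timedelta(hours=4)).date()


print(study_date(datetime(2024, 3, 10, 2, 30)))  # 2024-03-09
print(study_date(datetime(2024, 3, 10, 4, 0)))   # 2024-03-10
```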

You don't necessarily have to save these Reviews to the server immediately. Batches of a dozen or more are fine, and larger batches let the server update things such as SRS values and the day's progress stats more efficiently.

Multi-day Batches

If your client supports offline studying, and you have more than one day's Reviews built up, save them one day at a time with at least ten seconds between each batch submission. The Review processing system is built for the Flash client, which saves prompts shortly after they are generated. This is set to change in the future to be more flexible and to no longer require such careful measures, but for the time being be careful in this scenario.
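One way to organize day-by-day submission is sketched below in Python. The submit callable stands in for whatever code actually POSTs to the Reviews Endpoint; it, the function name, and the (study date, review) pair format are hypothetical. The sleep parameter exists only so the pause can be replaced in tests.

```python
import time
from collections import defaultdict


def submit_by_day(reviews, submit, pause_seconds=10, sleep=time.sleep):
    """Submit reviews one study day at a time, oldest day first.

    reviews: iterable of (study_date, review) pairs.
    submit:  hypothetical callable taking (study_date, list_of_reviews).
    """
    by_day = defaultdict(list)
    for day, review in reviews:
        by_day[day].append(review)

    for index, day in enumerate(sorted(by_day)):
        if index > 0:
            sleep(pause_seconds)  # leave a gap between batch submissions
        submit(day, by_day[day])
```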

Right now, you cannot save overall progress data to days before the last update to a person's account. So if you have reviews from Monday, but the user has already saved reviews to her account on Tuesday, you cannot save those overall progress stats to Monday. You will need to set it to a more recent date. Individual Items will still retain their proper values (saying exactly when they were last studied) but overall progress stats will be applied to the wrong date.

Synchronization

Inevitably, data gets out of sync. When you try to save Reviews based on old data, the server will recognize this and revert or recalculate the interval on the fly. But it's up to the client to realize when data gets out of sync and to handle it correctly.

First, occasionally poll the Review Errors Endpoint to know which Reviews you submitted didn't go through. If something did go wrong, fetch the Item again from the server and continue on based on what the server decided.

Second, your client should guard itself when other clients update the User's data. There are multiple strategies for doing this:

  • Use the Items Endpoint occasionally to get a summary (using the fields parameter) of Items that have changed since you last checked, and make sure the server's interval values match your client's. If they don't, fetch those whole Items again.
  • When online, check the next few scheduled Items to make sure they match the server before studying them.
  • Automatically do a full download of the User's data occasionally, and provide a mechanism for the User to do this manually if they wish.
  • Only support online studying, and instead of maintaining a full copy of the User's data, fetch small batches of Items to study as needed directly from the server.
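For the first strategy, the comparison step might look like this Python sketch. It assumes the summaries have already been fetched and parsed into dicts with "id" and "interval" keys; the function name and both data shapes are hypothetical.

```python
def items_to_refetch(local_items, server_summaries):
    """Return ids of Items whose server interval differs from the local copy.

    local_items:      dict mapping Item id -> local Item dict.
    server_summaries: list of dicts with "id" and "interval" keys.
    """
    stale = []
    for summary in server_summaries:
        local = local_items.get(summary["id"])
        # Unknown Items and Items with a different interval need a full fetch.
        if local is None or local["interval"] != summary["interval"]:
            stale.append(summary["id"])
    return stale
```

The returned ids are the Items to fetch again in full from the server.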

Keeping Vocabs Up To Date

In progress.