Check out these Yomichan dictionaries
My contributions to the Japanese learning community. For questions and support, please make a thread in the questions forum in TheMoeWay. For suggestions please mention @Marv.
- Contribution
- Other Resources
- Frequency Dictionaries
- Sorting Mined Anki Cards by Frequency
- Backfilling Stylized Frequencies in JP Mining Note
- Anki Card Blur
- Anki Automatically Highlight in Sentence
- Anki Automatic Hint Sentence for Kana Cards
- Yomichan Text Replacement Patterns
- Fixing the Font Language in Anki
Contributions are welcome; feel free to open a pull request. Note that there is a Prettier config file in the repo for auto-formatting with the Prettier extension.
In addition, the Markdown All in One extension can be used to automatically generate and update a table of contents as well as assist in markdown editing.
- JP resources by Aquafina water bottle and a very promising Anki note template (maintenance now taken over by Arbyste)
- arujisho - the BEST android dictionary.
These are absolutely essential.
Much thanks to:
- Renji-xD for originally rewriting the handlebar to find a minimum value.
- KamWithK for developing cool Anki addons to use with this guide.
- Aquafina-water-bottle for much handlebar wizardry in rewriting the frequency handlebar to be radically better and developing a python script that greatly improves the backfilling process.
- GrumpyThomas, pj, and aka_baka for some suggestions.
- Michel for converting the Chinese txt frequency files
I sometimes get asked about what frequency dictionaries to use and the differences between them, so here are a few essential dictionaries I would recommend.
- JPDB
  - Frequency data scraped from https://jpdb.io in May of 2022. Due to the way the data was scraped, some terms are missing frequencies and the jpdb dictionary itself is limited to terms in JMDict. For example, 経緯 only has an entry for the いきさつ reading so it should not be used as a dictionary for sorting (the more common/correct reading is けいい). The corpus of JPDB is quite good for immersion learners as it covers anime, dramas, light novels, visual novels, and web novels so the frequencies will be relatively accurate to what you're actually reading. This dictionary is notable for displaying the frequencies of kana readings separately, so you can often get a sense of how often a word is written with kanji or not.
- Innocent Ranked
  - The Innocent Corpus from the Yomichan page but reordered to be sorted by rank. It is based on data from 5000+ novels. A weakness is that it does not differentiate based on reading, so all readings of a term will show the same value.
- BCCWJ
  - From the publication:
    > The balanced corpus of contemporary written Japanese (BCCWJ) is Japan’s first 100 million words balanced corpus. It consists of three subcorpora (publication subcorpus, library subcorpus, and special-purpose subcorpus) and covers a wide range of text registers including books in general, magazines, newspapers, governmental white papers, best-selling books, an internet bulletin-board, a blog, school textbooks, minutes of the national diet, publicity newsletters of local governments, laws, and poetry verses.
  - It has extremely wide coverage with most terms you'll encounter having an entry in this list even if other frequency lists don't. In addition, it differentiates between readings quite well. Make sure to install the LUW version as it has more terms.
- CC100
  - Made by the mind behind arujisho, this uses the CC100 dataset which was made by crawling the web. Coverage is very wide, and there is reason behind the way readings are differentiated which is why I use this as my Yomichan sort dictionary.
When reading and adding cards from the content you're reading, you'll come across a variety of words with varying degrees of usefulness. Especially as a beginner, you'll want to learn the useful words as soon as possible and learn the less useful words later. With this we can sort a backlog of mined cards by frequency using various installed Yomichan frequency lists.
This handlebar for Yomichan will add a `{freq}` field that will use your installed frequency dictionaries to send a numerical frequency value to Anki depending on the sort option applied, with the default being the (recommended) harmonic mean.
- First, in your Anki card template create a new field for frequency; we can name this `Frequency` or whatever you like.
- Then in Yomichan options, insert the following handlebars code at the end of the menu in `Configure Anki card templates...`.

  > [!NOTE]
  > This is the same handlebar that is used in jp-mining-note, but with a different name and with the options included. If you want to use it with jp-mining-note, you can copy the code below and rename `freq` in the first line to `jpmn-frequency-sort`, then remove the options section.
- In `Configure Anki card format...`, we may need to refresh the card model for the new field to show up.
  - To do this, change the model to something else and change it back. ⚠️ This will clear your fields, so take a screenshot to remember what you had.
    - You can try duplicating your card model in Anki and switching to/from that model, so hopefully your card fields will remain.
- When your frequency field shows up, type in `{freq}` in its value box to use the handlebar.
The default settings within the `freq` handlebars code should work for most people. However, it can be customized if desired. To access the settings, head back to Yomichan's templates (Yomichan options → Anki → `Configure Anki card templates...`), and view the lines right below `{{#*inline "freq"}}`.
Ignoring Frequency Dictionaries
- By default, `JLPT_Level` is ignored. If you want to ignore other dictionaries, edit the `opt-ignored-freq-dict-regex` variable and join the dictionary names with `|`. For example, to ignore `My amazing frequency dictionary`, do the following:
Ignoring Frequency Values
- By default, any frequency value containing `❌` is ignored as it represents a value that does not appear in JPDB. If you want to ignore other values, edit the `opt-ignored-freq-value-regex` variable and join the ignored symbols with `|`. For example, to also ignore entries containing a `⚠` symbol, do the following:
- If you do not wish to ignore any values, remove the ❌ symbol from the regex.
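
To get a feel for how `|`-joined patterns behave in the two ignore options, here is a small Python sketch. It is only an illustration of regex alternation, not the handlebars code itself. The pattern strings, the anchoring with `^(...)$`, and the sample entries are all assumptions made for the example.

```python
import re

# Hypothetical patterns: names and symbols joined with "|" as described above.
# Anchoring the dictionary pattern with ^(...)$ is an assumption of this sketch.
ignored_dict_regex = re.compile(r"^(JLPT_Level|My amazing frequency dictionary)$")
ignored_value_regex = re.compile(r"❌|⚠")


def keep_entry(dict_name: str, value: str) -> bool:
    """Return True if a frequency entry survives both ignore filters."""
    if ignored_dict_regex.search(dict_name):
        return False  # the whole dictionary is ignored
    if ignored_value_regex.search(value):
        return False  # the value contains an ignored symbol
    return True


# Made-up entries in the form (dictionary name, frequency string).
entries = [
    ("JPDB", "440"),
    ("JPDB", "❌"),
    ("JLPT_Level", "N3"),
    ("My amazing frequency dictionary", "12"),
    ("CC100", "1520⚠"),
]
kept = [entry for entry in entries if keep_entry(*entry)]
print(kept)
```

Only the plain JPDB entry is kept: the others are dropped either because of their dictionary name or because their value contains one of the ignored symbols.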
Default Value For No Frequencies
- When no frequencies are listed for the expression, the default value given is `9999999`. This puts the card at the very end of the queue.

  Some users may prefer setting the default value to 0, as it would be a conscious decision to add a term without a frequency value and you may want to prioritize learning the term immediately. To do this, change the `opt-no-freq-default-value` variable. For example:
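
The effect of this default on queue order can be sketched in Python. This is an illustration of the sorting behaviour only, not the handlebars setting; the card dictionaries and the word without a frequency are made up for the example.

```python
NO_FREQ_DEFAULT = 9999999  # the default described above; 0 puts such cards first


def sort_key(card, default=NO_FREQ_DEFAULT):
    """Use the card's frequency, or the default when no frequency was found."""
    freq = card["Frequency"]
    return default if freq is None else freq


# Hypothetical mined cards; None stands for "no frequency listed".
cards = [
    {"Expression": "読む", "Frequency": 440},
    {"Expression": "以前", "Frequency": 721},
    {"Expression": "造語", "Frequency": None},
]

order_with_large_default = [c["Expression"] for c in sorted(cards, key=sort_key)]
order_with_zero_default = [
    c["Expression"] for c in sorted(cards, key=lambda c: sort_key(c, default=0))
]
print(order_with_large_default)
print(order_with_zero_default)
```

With the large default the card without a frequency lands at the end of the queue; with a default of 0 it moves to the front.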
Default Value For Grammar Dictionaries
- By default, if you create a card for any grammar point, the frequency will be automatically set to `0`. This is because it is very likely that you would want to prioritize reviewing grammar points as much as possible.

  The `{freq}` handlebars code determines whether a card is a grammar point or not by your installed grammar dictionaries. If the definition within the exported term contains any grammar dictionary, then it is considered as a grammar point. Otherwise, the term is treated like any other term.

  Note: This may incorrectly override the frequency for some terms that might not be considered as a grammar point. For example, 以前 can be used as a standalone word, but is an entry under the 毎日のんびり日本語教師 dictionary. In other words, with this feature enabled, 以前 will have its `{freq}` value incorrectly overwritten (to 0 by default).

  This incorrect override usually only happens for very common words anyways (JPDB ranks 以前 as 721), so this should not be a very big problem.
The following table summarizes the options related to this.
| Option | Description |
|---|---|
| `opt-grammar-override` | If set to `true` (default), overrides the resulting frequency with `opt-grammar-override-value` if at least one dictionary is determined to be a grammar dictionary. Set this variable to `false` in order to disable the behavior. |
| `opt-grammar-override-value` | The exact frequency value used for grammar dictionaries. |
| `opt-grammar-override-dict-regex` | The regex used in order to determine if a dictionary is a grammar dictionary. Edit this like any other `dict-regex` variable, i.e. by concatenating strings with `\|`. |
Sorting Method
- The sorting method determines the resulting value of `{freq}`. By default, the harmonic mean is chosen. This can be modified by changing `opt-freq-sorting-method`, e.g.

  The following table shows the available sorting methods. Note that these are case sensitive!

  | Sorting Method | Description |
  |---|---|
  | `min` | Gets the smallest frequency available. |
  | `first` | Gets the first frequency listed in Yomichan. The order of frequency dictionaries is determined by the `Priority` column under Yomichan settings → `Configure installed and enabled dictionaries...`. Dictionaries are sorted from highest to lowest priority. |
  | `avg` | Gets the average (i.e. the arithmetic mean) of the frequencies. |
  | `harmonic` | Gets the harmonic mean of the frequencies, which can be thought of as an in-between of `min` and `avg`. See below for more details. This is the default value. |
  | `debug` | Internal mode to show the dictionaries and frequencies for each dictionary, after being filtered by `opt-ignored-freq-dict-regex` and `opt-keep-freqs-past-first-regex`. Useful when testing the aforementioned regexes. |

  The harmonic mean has the following properties that may make it more attractive to use over `avg`:

  - "The harmonic mean of a list of numbers tends strongly toward the least elements of the list."[^1] In other words, a frequency dictionary with an abnormally large value will not greatly affect the resulting value. Conversely, a frequency dictionary with an abnormally small value will affect the resulting value more than `avg`, but still less so than simply using `min`.
  - The harmonic mean is always greater than or equal to the minimum number and always less than or equal to the arithmetic mean.[^2] This makes `harmonic` ideal for people who want a statistic that takes into account all numbers, but does not arbitrarily deviate due to large outliers (which `avg` can easily do).
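
To make the difference between the sorting methods concrete, here is a minimal Python sketch that computes `min`, `first`, `avg` and `harmonic` for two made-up lists of frequency values. It uses Python's standard `statistics` module rather than the handlebars code, and the handlebar may round or format its result differently.

```python
from statistics import harmonic_mean, mean


def summarize(freqs):
    """Compute the four statistics described above for a list of frequencies."""
    return {
        "min": min(freqs),
        "first": freqs[0],
        "avg": mean(freqs),
        "harmonic": harmonic_mean(freqs),
    }


# Made-up values: three dictionaries that roughly agree...
typical = summarize([440, 500, 620])
# ...and the same term where one dictionary reports an abnormally large value.
with_large_outlier = summarize([440, 500, 26189])

for label, stats in (("typical", typical), ("outlier", with_large_outlier)):
    print(label, {name: round(value) for name, value in stats.items()})
```

In the second list the arithmetic mean is pulled into the thousands by the single large value, while the harmonic mean stays in the hundreds, close to the two dictionaries that agree.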
Reading Multiple Frequencies from the Same Dictionary
- Some frequency dictionaries have multiple numbers displayed. Among these dictionaries, there are two ways that these can be stored:

  - The frequency is stored as one string. For example, with 青空文庫熟語, the frequency is "160 (5406)". Only the first number (160) can be grabbed from this, and any numbers past this cannot be retrieved without hacking the code.
  - The frequency is stored as multiple strings. For example with JPDB, the frequency for 読む is stored as "440" and "26189 ㋕" (with the latter being read as 26189).

  By default, only the first number (440) will be considered in the sorting method. If you want the sorting method to also consider other numbers (such as 26189), add the desired dictionary to the `opt-keep-freqs-past-first-regex` variable, similarly to how dictionaries are added to `opt-ignored-freq-dict-regex` (concatenated with `|`).

  For example, adding JPDB to the variable will result in the following:

  And adding JPDB VN3 万 as well will result in the following:
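
The behaviour described in this section can be approximated with the following Python sketch. It is not the handlebars implementation: the function names are hypothetical, and the anchored pattern `^(JPDB)$` is an assumption about how the regex might be written.

```python
import re


def first_number(value: str):
    """Return the first run of digits in a frequency string, or None."""
    match = re.search(r"\d+", value)
    return int(match.group()) if match else None


def collect(entries, keep_past_first_regex=None):
    """Collect numbers from (dictionary name, frequency string) pairs.

    Only the first entry of each dictionary is used unless the dictionary
    name matches keep_past_first_regex.
    """
    seen = set()
    numbers = []
    for name, value in entries:
        number = first_number(value)
        if number is None:
            continue
        is_repeat = name in seen
        seen.add(name)
        keep_all = bool(keep_past_first_regex and re.search(keep_past_first_regex, name))
        if is_repeat and not keep_all:
            continue
        numbers.append(number)
    return numbers


jpdb_entries = [("JPDB", "440"), ("JPDB", "26189 ㋕")]
print(first_number("160 (5406)"))
print(collect(jpdb_entries))
print(collect(jpdb_entries, keep_past_first_regex=r"^(JPDB)$"))
```

A string such as "160 (5406)" only ever yields its first number, while a dictionary that stores several separate strings can contribute all of them once its name matches the keep pattern.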
- Use the AnkiAutoReorder addon to have your backlog sort automatically on refresh.
  - Enter your search query (`search_to_sort`) and your `sort_field` into the addon's config (`Tools > Addons > AutoReorder > Config`).
    - You can get the appropriate search query by going to the Browse window, then ctrl-clicking your deck name and the "New" card state. The string at the top is the search query you can use in the addon settings; it should have the deck name and `is:new`.
  - Then reorder your deck by frequency from `Tools > Reposition Cards`. Remember to do this every day after adding new cards.
I also recommend installing the Advanced Browser addon to display the frequency field in Anki's browse page.
Below: right click the column headers at the top with Advanced Browser installed to select new fields to be displayed.
Alternatively, after installing Advanced Browser, you could sort by the frequency field and press `ctrl + a` then `ctrl + shift + s` to select all cards and reorder.
If you already have a large backlog of old cards without frequency values, you might need to fill in these values first or they won't be sorted. There are two methods listed below to do exactly that. The command line method runs much faster than the Anki method, but requires some command line knowledge to pull off.
Of course, you could just opt to finish reviewing these cards first instead of backfilling the old cards.
Warning: Make sure to back up your collection before trying either method below.
Differences between the backfill .txt files
- `JPDB.txt` - Japanese list from jpdb.io
- `cc100.txt` - The CC100 dataset as described in the Frequency Dictionaries section.
- `vnsfreqSTARS.txt` and `vnsfreq.txt` - Japanese frequency lists from visual novels
- `BLCUcoll.txt` and `BLCUlit.txt` - Chinese frequency lists from colloquial and literary text from the BLCU BCC Corpus.
- `SUBTLEX-CH.txt` - Chinese frequency list based on movie/TV subtitles from SUBTLEX-CH.

Note that the Japanese ones are selected by default when backfilling via the command line; you will have to use the `--freq-lists` option to specify other lists.
Backfilling: Command Line (Recommended)
- Install the latest version of Python if you do not have it already installed. Any Python version 3.8 or above should work.
- Install AnkiConnect if you do not have it already installed.
- Open Anki. If you just installed AnkiConnect, make sure to restart Anki so AnkiConnect is properly running.
  - Note that this will not work via WSL due to networking constraints. If you are on Windows you will have to use Command Prompt or Powershell.
- Run the following commands:

git clone "https://github.com/MarvNC/JP-Resources.git"
cd JP-Resources
cd frequency
# Linux users might have to use `python3` instead of `python`.
# Replace "Expression" with the exact field name that contains the word/expression.
python backfill.py "Expression"

Here are some more examples on how to use `backfill.py`:

# View all possible arguments.
python backfill.py --help

# Searches for the expression in the field "Word" instead of "Expression"
# Note that this is case sensitive!
python backfill.py "Word"

# Sets all expressions without any found frequencies to the default value of '0'.
python backfill.py "Expression" --default 0

# Uses the field "FrequencySort" instead of the default ("Frequency").
# This also changes the default query to search an empty `FrequencySort` field.
python backfill.py "Expression" --freq-field "FrequencySort"

# Uses a custom query instead of the default ("Expression:* Frequency:").
# Note: For powershell users, you must escape the quotes with an additional backtick:
#   --query "Frequency: \`"note:My mining note\`""
python backfill.py "Expression" --query "Frequency: \"note:My mining note\""

# This custom query can be used to override all of your existing frequencies,
# instead of just backfilling. RUN THIS WITH CAUTION!
python backfill.py "Expression" --query "\"note:My mining note\""

# Changes the order of which frequency list is used first.
python backfill.py "Expression" --freq-lists "vnsfreq.txt" "JPDB.txt"
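
If the script cannot reach Anki, it can help to talk to AnkiConnect directly. The following Python sketch builds an AnkiConnect request and counts the notes matching a query. It is a standalone check and not necessarily how `backfill.py` works internally; the helper names `build_request`, `invoke` and `count_matching_notes` are made up for this example.

```python
import json
import urllib.request

# AnkiConnect's default address, as used elsewhere in this guide.
ANKI_CONNECT_URL = "http://127.0.0.1:8765"


def build_request(action, **params):
    """Build the JSON body AnkiConnect expects (API version 6)."""
    return {"action": action, "version": 6, "params": params}


def invoke(action, **params):
    """Send one request to AnkiConnect and return its result."""
    payload = json.dumps(build_request(action, **params)).encode("utf-8")
    request = urllib.request.Request(
        ANKI_CONNECT_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]


def count_matching_notes(query="Expression:* Frequency:"):
    """Count notes matching a search query (Anki must be running)."""
    return len(invoke("findNotes", query=query))
```

With Anki open, calling `count_matching_notes()` from a Python shell shows how many notes the default query would select. A connection error usually means Anki is not running or AnkiConnect is not installed.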
Backfilling: Within Anki
- This is a hacky method to backfill your old cards. Again, make sure to back up your collection before attempting this; it could cause significant lag to your Anki. In addition, for users of Anki 2.1.50+, increase your backup interval before attempting the import as it will take a long time. A backup occurring while you're waiting on Anki to delete cards will just cause more lag.
- Create a frequency list in `.txt` format that contains a list of expressions followed by frequency values. You can use the ones I have created here; I recommend downloading the JPDB list as it's the most exhaustive. However, the VN Stars list also fills in some of the gaps that JPDB doesn't cover, so you could import it first, then import JPDB afterward for maximum frequency coverage.
- In Anki, create a new temporary deck and move your backlogged cards to the new deck, then tag them for later.
  - Search for the backlogged new cards using `deck:{deckname} is:new` in your card browser, then hit `ctrl + a` to select them all then `ctrl + d` to bring up the "Change Deck" menu from which you can create a new deck (named `temp` or whatever you like) and move them.
  - Select this new deck, then tag them using `ctrl + a` then `ctrl + shift + a` to add a new tag, where you can type in something like `backlog`.
- With this temporary deck selected, go to File -> Import, then select the txt frequency list. Map the first field to your term/expression field, then the second field to your frequency field. Make sure to enable "Update existing notes when first field matches." Then import it to your temporary deck.
- This will update your existing notes' frequency values, but it'll also import a LOT of new unneeded cards.
  - Search for your backlogged cards using `tag:backlog` and then again hit `ctrl + a` then `ctrl + d` to move them back to your vocabulary deck. Now we can simply delete the temporary deck along with all the new cards that were added; just make sure you aren't deleting any actual cards first.
- Finally, you can right click the `backlog` tag in the sidebar and delete it.
If you frequently make cards that don't contain frequencies, such as sentence or grammar cards, you won't be able to pull frequencies from dictionaries. If you tag all of these cards specifically, you can use this plugin to generate random frequencies for these cards.
In the JP Mining Note Anki note type, there is also a FrequenciesStylized field for displaying the values from various frequency dictionaries on the front of the card. Due to the specific formatting requirements of this field, it cannot be backfilled with the above methods. A separate script is provided in the `frequencies/frequenciesstylized` folder for this purpose.

Warning: As always, back up your entire collection before performing any steps from this section.

Before running the script, you will need to configure the list of frequency dictionaries to be used:

The set of frequency dictionaries to use can be configured by editing the `dict_names.py` file. The default values in this file are shown below:
dict_names = [
('JPDB-stylized.txt', 'JPDB'),
('../vnsfreq.txt', 'VN Freq'),
('JLPT-stylized.txt', 'JLPT')
]
The order of the dictionaries in this list determines the order that the frequencies will appear in the FrequenciesStylized field. Within each entry, the first parameter is the relative filepath to the frequency list, and the second parameter is the display name you want to use for that dictionary.
For example, the above configuration produces the following result for 返事:
If you change the `dict_names.py` file to:
dict_names = [
('../vnsfreq.txt', 'VN Freq'),
('JPDB-stylized.txt', 'jpdb'),
('JLPT-stylized.txt', 'jlpt')
]
Then it will now produce this output: (note the lowercase dictionary names)
Note the `../` in the filepath for the VN Freq dictionary. This script can use any of the frequency lists that are used by `backfill.py`. However, if there is a stylized version of a frequency list, then it is highly recommended that you use that one, rather than the simpler version. This is because the stylized version includes additional formatting, such as JPDB's ㋕ marker for kana frequencies.

Stylized versions of frequency lists also include the reading for each word, so if your cards have the `WordReadingHiragana` field filled in, then the script can ensure that only the frequencies for the correct reading are used. If your notes do not have the `WordReadingHiragana` field filled, then it's highly recommended that you fill it using the instructions on the JP Mining Note docs.
Included Stylized Frequency Dictionaries
- `JPDB-stylized.txt` - Same as `JPDB.txt` above, but includes the ㋕ marker to indicate kana form frequency, and word readings to differentiate between different words that use the same kanji.
- `cc100-stylized.txt` - The CC100 dataset as described in the Frequency Dictionaries section.
- `JLPT-stylized.txt` - Provides the JLPT level for words tested on the JLPT. Extracted from stephenmk's yomichan-jlpt-vocab yomichan dictionary.
Once you have configured the list of dictionaries to use, you can run the script. The simplest way to run this script is to navigate into the `frequencies/frequenciesstylized` folder, and run:
# Linux users might have to use `python3` instead of `python`.
python backfill-stylized.py
This will search your collection for all notes of type `JP Mining Note` with an empty `FrequenciesStylized` field. It will then fill those fields with the appropriate frequency information as determined by your configuration in `dict_names.py`. It will also tag every note it modifies with the tag `backfill-stylized`. There are two options that can be used with this script:
query
The `--query` option works in the same way as it does in the standard backfill.py script. This allows you to use a custom query to find the cards to modify.
For example, if you want to overwrite the FrequenciesStylized field for all JP Mining Notes, and not just those where the field is already empty, you can use the following:
# This custom query can be used to override all of your existing frequencies,
# instead of just backfilling. RUN THIS WITH CAUTION!
python backfill-stylized.py --query "\"note:JP Mining Note\""
One thing to be careful of is that your custom query must only return notes of type `JP Mining Note` with `Word` and `FrequenciesStylized` fields. If it returns any other type of note, it will throw an error. You can ensure only JP Mining Notes are returned by always including `\"note:JP Mining Note\"` in your queries.
tag
By default, every note that is modified by this script will be tagged with the tag `backfill-stylized`. This makes it easy to revert your changes if you make a mistake. To reset the modified cards and start again, search for them in the Anki browser using `tag:backfill-stylized`, select all the cards, and then clear the `FrequenciesStylized` field using the procedure in the next section.

Once you are happy with your cards, you can remove the tags by searching Anki for `tag:backfill-stylized`, and using `Notes -> Remove Tags...` to remove `backfill-stylized`.
If you want to use a different tag, you can use the `--tag` option:
# Tags all modified notes with "modified-stylized-freq"
python backfill-stylized.py --tag "modified-stylized-freq"
If you don't want the script to tag any notes, use `--tag ""`:
# Prevents the script from tagging any notes
python backfill-stylized.py --tag ""
If you have never edited the `FrequenciesStylized` field on a note, then it is probably completely empty, and `backfill-stylized.py` will be able to find the note.

However, in some cases, the `FrequenciesStylized` field might look empty, when in fact it has some hidden HTML tags in it. In this case, the script will not be able to find these notes, since it is only looking for notes where this field is empty.
(Screenshots: FrequenciesStylized looks empty, but it actually has hidden HTML elements.)
You can clear this HTML directly by clicking on the HTML toggle button marked in the above image. Then just delete the HTML from the editor.
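
To see why such a field is skipped, the check below treats a value as "hidden HTML" when it is not an empty string but shows nothing once tags and whitespace are removed. This is a minimal Python sketch for inspecting field values you have exported or copied out of Anki; the function name is made up and it is not part of `backfill-stylized.py`.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]*>")


def looks_empty_but_is_not(field_value: str) -> bool:
    """True if the field displays nothing but still contains hidden markup."""
    if field_value == "":
        return False  # genuinely empty: the script can find this note
    without_tags = TAG_RE.sub("", field_value)
    # Decode entities such as &nbsp; and drop surrounding whitespace.
    visible = html.unescape(without_tags).replace("\xa0", " ").strip()
    return visible == ""


# Made-up field values for illustration.
for value in ["", "<div><br></div>", "&nbsp;", "<span>JPDB: 440</span>"]:
    print(repr(value), looks_empty_but_is_not(value))
```

Values such as `<div><br></div>` are flagged, which matches the situation described above: the field looks blank in the editor but is not empty.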
If you need to completely clear the `FrequenciesStylized` field for several cards at once, first select all the relevant cards in the Anki browser. Then, go to `Notes -> Find and Replace...` and enter the options shown below.

WARNING: Unless you know exactly what you're doing, only use the options shown below. Using different options has the potential to delete an arbitrary amount of information from an arbitrary number of cards in your collection.

After clicking OK, the `FrequenciesStylized` field for all selected notes will be completely emptied.
When adding cards from VNs, we might find some risque content that we still want to look at while reviewing because it's cute. However, you might review in places where you don't always want other people to see your cards. Using this card template, we can blur media in Anki and have the option persist throughout a review session.
(Screenshots: SFW and NSFW cards, each shown with blur disabled and blur enabled.)
Media: ハミダシクリエイティブ © まどそふと
- Decide on a tag for NSFW cards. I use `-NSFW` so the tag is sorted first for easy access. If you choose something else you'll need to replace all instances of `-NSFW` in this guide with your tag name (with `ctrl + h` in a text editor or an online tool).
- Tag your NSFW cards with this tag in Anki (see ShareX Hotkey).
- Download the anki-persistence script (`minified.js` or `script.js`) from here. Then rename it `__persistence.js` and place it in your Anki user/media folder.
- In your card template where you want the image to go, paste in this HTML, renaming `{{Picture}}` to match the name of the field that contains your media.
<div id="main_image" class="{{Tags}}">
<a onclick="toggleNsfw()">{{Picture}}</a>
</div>
- Then, at the end of the template paste in this code:
<script src="__persistence.js"></script>
<script>
// nsfw https://github.com/MarvNC/JP-Resources
(function () {
const nsfwDefaultPC = true;
const nsfwDefaultMobile = false;
const imageDiv = document.getElementById('main_image');
const image = imageDiv.querySelector('a img');
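if (!image) {
// No image on this card: remove the container and skip the blur setup,
// otherwise setImageStyle would later look up an element that no longer exists.
imageDiv.parentNode.removeChild(imageDiv);
return;
}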
let loaded = false;
setInterval(() => {
if (!loaded) {
if (typeof Persistence === 'undefined') {
return;
}
loaded = true;
let onMobile = document.documentElement.classList.contains('mobile');
let nsfwAllowed = onMobile ? nsfwDefaultMobile : nsfwDefaultPC;
if (Persistence.isAvailable() && Persistence.getItem('nsfwAllowed') == null) {
Persistence.setItem('nsfwAllowed', nsfwAllowed);
} else if (Persistence.isAvailable()) {
nsfwAllowed = Persistence.getItem('nsfwAllowed');
}
setImageStyle(nsfwAllowed);
}
}, 50);
})();
function toggleNsfw() {
if (Persistence.isAvailable()) {
let nsfwAllowed = !!Persistence.getItem('nsfwAllowed');
nsfwAllowed = !nsfwAllowed;
Persistence.setItem('nsfwAllowed', nsfwAllowed);
setImageStyle(nsfwAllowed);
} else {
setImageStyle(undefined, true);
}
}
function setImageStyle(nsfwAllowed = undefined, toggle = false) {
const imageDiv = document.getElementById('main_image');
const image = imageDiv.querySelector('img');
if (nsfwAllowed != undefined) {
imageDiv.classList.toggle('nsfwAllowed', nsfwAllowed);
} else if (toggle) {
imageDiv.classList.toggle('nsfwAllowed');
}
}
</script>
Then in your card styling paste in the following css, making sure to replace `-NSFW` with your tag name.
#main_image.nsfwAllowed {
border-top: 2.5px dashed fuchsia !important;
}
#main_image {
border-top: 2.5px solid springgreen;
}
#main_image img {
cursor: pointer;
}
#main_image.-NSFW {
border-left: 2.5px dashed red;
border-right: 2.5px dashed red;
border-bottom: 2.5px dashed red;
}
#main_image.nsfwAllowed.-NSFW {
border-top: 2.5px dashed red !important;
}
#main_image.-NSFW img {
filter: blur(30px);
}
#main_image.nsfwAllowed img {
filter: blur(0px) !important;
}
During a review session, you can click/tap the image to toggle card blurring. When the blurring is enabled, there will be a solid green line at the top of the image. When blurring is not enabled, there will be a dashed fuchsia line, and when the card is NSFW the borders will be dashed red. This option will persist throughout a review session but the setting will reset after exiting the session.
In the code we pasted in the template there are variables that can change whether blurring is enabled by default on desktop/mobile separately; the thought being that this script is primarily intended for reviewing on a phone. These variables can be changed, with `true` marking that cards will not be blurred by default.
const nsfwDefaultPC = true;
const nsfwDefaultMobile = false;
If you want all cards to be blurred by default and for it to stay that way, you can simply do something like this instead. The `.mobile` part can be removed so it works on desktop as well.
<div class="main_image {{Tags}}">{{Picture}}</div>
.mobile .-NSFW img {
filter: blur(30px);
}
.mobile .-NSFW img:hover {
filter: blur(0px);
}
I use the hotkeys in this guide (highly recommended) for adding images/audio to new cards while reading. For the screenshot hotkey, I have a hotkey in addition to the normal one that adds a `-NSFW` tag to the new card for convenience so they don't have to be tagged manually after creation. In the argument part of step 8, just use this code instead:
-NoProfile -Command "$medianame = \"%input\" | Split-Path -leaf; $data = Invoke-RestMethod -Uri http://127.0.0.1:8765 -Method Post -ContentType 'application/json; charset=UTF-8' -Body '{\"action\": \"findNotes\", \"version\": 6, \"params\": {\"query\":\"added:1\"}}'; $sortedlist = $data.result | Sort-Object -Descending {[Long]$_}; $noteid = $sortedlist[0]; Invoke-RestMethod -Uri http://127.0.0.1:8765 -Method Post -ContentType 'application/json; charset=UTF-8' -Body \"{`\"action`\": `\"updateNoteFields`\", `\"version`\": 6, `\"params`\": {`\"note`\":{`\"id`\":$noteid, `\"fields`\":{`\"Picture`\":`\"<img src=$medianame>`\"}}}}\"; " Invoke-RestMethod -Uri http://127.0.0.1:8765 -Method Post -ContentType 'application/json; charset=UTF-8' -Body \"{ `\"action`\": `\"addTags`\",`\"version`\": 6,`\"params`\": {`\"notes`\": [$noteid],`\"tags`\": `\"-NSFW`\"}}\";
It's good practice to have your word highlighted within the target sentence so it's easier to see. You can do this for new cards by using the cloze options in Yomichan, but that doesn't affect existing cards that don't have the word highlighted. Here's some code to highlight the target expression within already existing cards. It's quite flexible, being able to work to some degree for most cards even if the sentence doesn't exactly contain the expression, or if it contains the expression but in hiragana or katakana.
To use it, simply append the following script to the end of a card.
- You need to modify the lines specifying your field names by changing `{{Expression}}` and `{{Reading}}` to match your field.
- You also need to modify the selector to select the part of your card containing your sentence. An easy way to do this would be to wrap your sentence in a `div` with an id of `sentence` so the selector is `#sentence` as it is by default. For example, `<div id="sentence">{{Sentence}}</div>`.
<script>
// https://github.com/MarvNC/JP-Resources
(function () {
const expression = '{{Expression}}';
const reading = '{{Reading}}';
const sentenceElement = document.querySelector('#sentence');
highlightWord(sentenceElement, expression, reading);
})();
function highlightWord(sentenceElement, expression, reading) {
const sentence = sentenceElement.innerHTML;
if (!sentence.match(/<(strong|b)>/)) {
let possibleReplaces = [
// shorten kanji expression
shorten(expression, sentence, 1),
// shorten with kana reading
shorten(reading, sentence, 2),
// find katakana
shorten(hiraganaToKatakana(expression), sentence, 2),
// find katakana with kana reading
shorten(hiraganaToKatakana(reading), sentence, 2),
];
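// find and use longest one that is a substring of the sentence
// (the '' initial value keeps reduce from throwing when nothing matches)
const replace = possibleReplaces
.filter((str) => str && sentence.includes(str))
.reduce((a, b) => (a.length > b.length ? a : b), '');
// nothing matched: leave the sentence unchanged
if (!replace) return;
sentenceElement.innerHTML = sentenceElement.innerHTML.replace(
// escape regex metacharacters so the match is treated as literal text
new RegExp(replace.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g'),
`<strong>${replace}</strong>`
);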
}
}
// takes an expression and shortens it until it's in the sentence
function shorten(expression, sentence, minLength) {
while (expression.length > minLength && !sentence.match(expression)) {
expression = expression.substr(0, expression.length - 1);
}
return expression;
}
function hiraganaToKatakana(hiragana) {
return hiragana.replace(/[\u3041-\u3096]/g, function (c) {
return String.fromCharCode(c.charCodeAt(0) + 0x60);
});
}
</script>
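To get a feel for which part of a sentence ends up highlighted, here is a minimal standalone sketch of the same matching idea that runs outside Anki (for example in Node) without any DOM. The names pickHighlight, trimToFit, and toKatakana are hypothetical and exist only for this illustration; the card script itself is the one above.

```javascript
// Standalone sketch of the matching idea; runs in Node without a DOM.
// pickHighlight, trimToFit and toKatakana are hypothetical names.
function toKatakana(text) {
  // hiragana (U+3041-U+3096) sits 0x60 below the matching katakana
  return text.replace(/[\u3041-\u3096]/g, (c) =>
    String.fromCharCode(c.charCodeAt(0) + 0x60)
  );
}

function trimToFit(term, sentence, minLength) {
  // drop trailing characters until the term occurs in the sentence
  while (term.length > minLength && !sentence.includes(term)) {
    term = term.slice(0, -1);
  }
  return term;
}

function pickHighlight(sentence, expression, reading) {
  const candidates = [
    trimToFit(expression, sentence, 1),
    trimToFit(reading, sentence, 2),
    trimToFit(toKatakana(expression), sentence, 2),
    trimToFit(toKatakana(reading), sentence, 2),
  ].filter((term) => term && sentence.includes(term));
  // keep the longest candidate; an empty string means nothing matched
  return candidates.reduce((a, b) => (a.length > b.length ? a : b), '');
}

console.log(pickHighlight('昨日ケーキを食べた。', '食べる', 'たべる')); // 食べ
console.log(pickHighlight('ネコが好きだ。', '猫', 'ねこ')); // ネコ
```

The first call shows a conjugated verb being trimmed down to its stem, and the second shows the katakana fallback when the sentence writes the word in kana.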
Kana-only terms can be annoying to review in Anki, as they're quite arbitrary and don't derive meaning from kanji. This makes them potentially harder to recall than kanji terms, without much benefit in return, since in the wild you'd come across onomatopoeia with context, which makes it somewhat self-explanatory what they're describing.
Because of this, you might find it helpful to conditionally display the sentence on the front of your cards to be able to learn kana terms along with the context.
Media: 蒼の彼方のフォーリズム EXTRA1 © sprite
In order to conditionally display the sentence on the front, put the following html on the front of your card template where you want your hint sentence.
- Replace all instances of {{Sentence}} with the name of your sentence field, and the same with {{Expression}} in the code.
- The anchor linking to jpdb is completely optional, and is used to make easy searches on jpdb. If you don't want this you can just replace the second line with {{Sentence}}. If you do, also change the selector '#hintSentence a' in the script to '#hintSentence' so the highlighting still finds the sentence.
<div id="hintSentence" style="display: none">
<a href="https://jpdb.io/search?q={{Sentence}}">{{Sentence}}</a>
</div>
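Put this code at the bottom of your front card template, making sure to rename {{Expression}} and {{Reading}} to match your field names. The script calls highlightWord from the highlighting script in the previous section, so that function needs to be defined on the front side as well.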
<script>
  // https://github.com/MarvNC/JP-Resources
  (function () {
    // prevent loading this js on back side of card
    if (document.getElementById('answer')) {
      return;
    }
    const expression = '{{Expression}}';
    const furigana = '{{Reading}}';
    const kanjiRegex = /[\u4e00-\u9faf]/g;
    if (!expression.match(kanjiRegex)) {
      const hintSentence = document.getElementById('hintSentence');
      hintSentence.style.display = 'block';
      const sentenceElement = document.querySelector('#hintSentence a');
      highlightWord(sentenceElement, expression, furigana);
    }
  })();
</script>
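The decision of whether to show the hint comes down to the kanji check, which can be tried on its own. The following is a small sketch under the assumption that "kana-only" simply means "contains no character in the range the script tests for"; shouldShowHint is a hypothetical helper name that does not appear in the card script.

```javascript
// shouldShowHint is a hypothetical helper used only for this sketch.
// Same character range as the card script, without the g flag so that
// repeated test() calls do not carry state between invocations.
const KANJI = /[\u4e00-\u9faf]/;

function shouldShowHint(expression) {
  // show the hint sentence only when the term contains no kanji
  return !KANJI.test(expression);
}

console.log(shouldShowHint('ふわふわ')); // true
console.log(shouldShowHint('テンパる')); // true
console.log(shouldShowHint('食べる')); // false
```

Terms that mix kanji and kana, such as 食べる, count as kanji terms here and therefore get no hint sentence.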
Some text replacement patterns in Yomichan Settings -> Translation -> Custom Text Replacement Patterns that I've found useful for better parsing.
If it might save you some time, you can optionally download the text replacement patterns from here, export your config, replace them in the appropriate spot, and reimport. Thanks to Julian for providing the export.
- Some expressions may occasionally be written using numerals and most dictionaries only have entries for the kanji version. You could try replacing 0 with 十 and so on for larger numbers, but it doesn't seem to be worth it in my experience.
- 鯛も1人はうまからず
- 3種
1|１ -> 一
2|２ -> 二
3|３ -> 三
4|４ -> 四
5|５ -> 五
6|６ -> 六
7|７ -> 七
8|８ -> 八
9|９ -> 九
- Occasionally there are things with dots, dashes, or other miscellaneous things in it that you want to scan.
- コピ・ルアク
- 「ど、どうですか……?モノに、なってきてます……?」
・|、|\-|\.|‐|\s -> (nothing)
- Sometimes katakana verbs will use ッ in the past tense form and won't be picked up by Yomichan.
- ハモッた
- テンパッた
ッ -> っ
- I should also mention the most important replacement pattern, replacing the 々 with the previous kanji as most monolingual dictionaries don't have entries for the 々 version. Credits to TheMoeWay's guide for the idea.
- 囂々
- 侃々諤々
(.)々 -> $1$1
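If you want to check what a pattern does before adding it to Yomichan, you can try it as a plain JavaScript regular expression. The sketch below only shows the effect of each pattern on a string; it does not model how Yomichan combines patterns internally. applyPatterns is a hypothetical helper, and pairing each ASCII digit with its fullwidth counterpart is an assumption about the intent of the digit patterns.

```javascript
// Sketch only: shows what each pattern does to a string.
// applyPatterns is a hypothetical helper, not a Yomichan API.
const kanjiDigits = '一二三四五六七八九';
const digitPatterns = [...kanjiDigits].map((kanji, i) => {
  const half = String(i + 1); // ASCII digit, e.g. "3"
  const full = String.fromCharCode(0xff10 + i + 1); // fullwidth digit (assumed)
  return [new RegExp(`${half}|${full}`, 'g'), kanji];
});

const patterns = [
  ...digitPatterns,
  [/・|、|\-|\.|‐|\s/g, ''], // strip dots, commas, dashes and whitespace
  [/ッ/g, 'っ'], // katakana small tsu -> hiragana small tsu
  [/(.)々/g, '$1$1'], // repeat the character before the iteration mark
];

function applyPatterns(text) {
  // apply every pattern in order to the running result
  return patterns.reduce(
    (out, [regex, replacement]) => out.replace(regex, replacement),
    text
  );
}

console.log(applyPatterns('3種')); // 三種
console.log(applyPatterns('コピ・ルアク')); // コピルアク
console.log(applyPatterns('ハモッた')); // ハモった
console.log(applyPatterns('侃々諤々')); // 侃侃諤諤
```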
If you're displaying Japanese/Chinese/Korean text in Anki, you might often get incorrect glyphs, as there are differences in the display of unified Han characters for different languages. In general, I recommend setting a lang attribute in your Anki card template so that the card is rendered correctly.
At the beginning of your card template (both front and back sides) add the following line, replacing ja with zh, zh-hk, zh-tw or ko as appropriate. You can look up the ISO language code for the language you want to display online.
<span lang="ja">
Then at the very bottom of your card template just add a closing span tag.
</span>
You may already have set a custom font using CSS, which is a good way to customize your cards, but it does not guarantee full compatibility in cases where a glyph is not present in the font you're using.