diff --git a/010_Intro/10_Installing_ES.asciidoc b/010_Intro/10_Installing_ES.asciidoc index 7e25b8978..d897c9552 100644 --- a/010_Intro/10_Installing_ES.asciidoc +++ b/010_Intro/10_Installing_ES.asciidoc @@ -8,22 +8,22 @@ Preferably, you should install the latest version of the((("Java", "installing") from http://www.java.com[_www.java.com_]. You can download the latest version of Elasticsearch from -http://www.elasticsearch.org/download/[_elasticsearch.org/download_]. +https://www.elastic.co/downloads/elasticsearch[_elasticsearch.co/downloads/elasticsearch_]. [source,sh] -------------------------------------------------- -curl -L -O http://download.elasticsearch.org/PATH/TO/VERSION.zip <1> +curl -L -O http://download.elastic.co/PATH/TO/VERSION.zip <1> unzip elasticsearch-$VERSION.zip cd elasticsearch-$VERSION -------------------------------------------------- <1> Fill in the URL for the latest version available on - http://www.elasticsearch.org/download/[_elasticsearch.org/download_]. + http://www.elastic.co/downloads/elasticsearch[_elastic.co/downloads/elasticsearch_]. [TIP] ==== When installing Elasticsearch in production, you can use the method described previously, or the Debian or RPM packages provided on the -http://www.elasticsearch.org/downloads[downloads page]. You can also use +http://www.elastic.co/downloads/elasticsearch[downloads page]. You can also use the officially supported https://github.com/elasticsearch/puppet-elasticsearch[Puppet module] or https://github.com/elasticsearch/cookbook-elasticsearch[Chef cookbook]. diff --git a/010_Intro/15_API.asciidoc b/010_Intro/15_API.asciidoc index 050438391..48685b23f 100644 --- a/010_Intro/15_API.asciidoc +++ b/010_Intro/15_API.asciidoc @@ -29,8 +29,7 @@ The Java client must be from the same _major_ version of Elasticsearch as the no otherwise, they may not be able to understand each other. ==== -More information about the Java clients can be found in the Java API section -of the http://www.elasticsearch.org/guide/[Guide]. +More information about the Java clients can be found in https://www.elastic.co/guide/en/elasticsearch/client/index.html[Elasticsearch Clients]. ==== RESTful API with JSON over HTTP @@ -41,8 +40,8 @@ seen, you can even talk to Elasticsearch from the command line by using the NOTE: Elasticsearch provides official clients((("clients", "other than Java"))) for several languages--Groovy, JavaScript, .NET, PHP, Perl, Python, and Ruby--and there are numerous -community-provided clients and integrations, all of which can be found in the -http://www.elasticsearch.org/guide/[Guide]. +community-provided clients and integrations, all of which can be found in +https://www.elastic.co/guide/en/elasticsearch/client/index.html[Elasticsearch Clients]. A request to Elasticsearch consists of the same parts as any HTTP request:((("HTTP requests")))((("requests to Elasticsearch"))) diff --git a/010_Intro/20_Document.asciidoc b/010_Intro/20_Document.asciidoc index b27b19bee..fcd1151f8 100644 --- a/010_Intro/20_Document.asciidoc +++ b/010_Intro/20_Document.asciidoc @@ -51,8 +51,8 @@ a flat table structure. ==== Almost all languages have modules that will convert arbitrary data structures or objects((("JSON", "converting your data to"))) into JSON for you, but the details are specific to each -language. Look for modules that handle JSON _serialization_ or _marshalling_. http://www.elasticsearch.org/guide[The official -Elasticsearch clients] all handle conversion to and from JSON for you +language. 
Look for modules that handle JSON _serialization_ or _marshalling_. The official +https://www.elastic.co/guide/en/elasticsearch/client/index.html[Elasticsearch Clients] all handle conversion to and from JSON for you automatically. ==== diff --git a/010_Intro/30_Tutorial_Search.asciidoc b/010_Intro/30_Tutorial_Search.asciidoc index 7acc66d1d..fe7adebe5 100644 --- a/010_Intro/30_Tutorial_Search.asciidoc +++ b/010_Intro/30_Tutorial_Search.asciidoc @@ -445,5 +445,5 @@ HTML tags: <1> The highlighted fragment from the original text You can read more about the highlighting of search snippets in the -http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html[highlighting reference documentation]. +{ref}/search-request-highlighting.html[highlighting reference documentation]. diff --git a/030_Data/45_Partial_update.asciidoc b/030_Data/45_Partial_update.asciidoc index 633f16c78..dc3cf6d0f 100644 --- a/030_Data/45_Partial_update.asciidoc +++ b/030_Data/45_Partial_update.asciidoc @@ -121,7 +121,7 @@ for example your Elasticsearch endpoints are only exposed and available to trust then you can choose to re-enable the dynamic scripting if it is a feature your application needs. You can read more about scripting in the -http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html[scripting reference documentation]. +{ref}/modules-scripting.html[scripting reference documentation]. **** diff --git a/050_Search/20_Query_string.asciidoc b/050_Search/20_Query_string.asciidoc index 4b15e969b..f4340dab8 100644 --- a/050_Search/20_Query_string.asciidoc +++ b/050_Search/20_Query_string.asciidoc @@ -116,7 +116,7 @@ readable result: As you can see from the preceding examples, this _lite_ query-string search is surprisingly powerful.((("query strings", "syntax, reference for"))) Its query syntax, which is explained in detail in the -http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax[Query String Syntax] +{ref}/query-dsl-query-string-query.html#query-string-syntax[Query String Syntax] reference docs, allows us to express quite complex queries succinctly. This makes it great for throwaway queries from the command line or during development. diff --git a/052_Mapping_Analysis/40_Analysis.asciidoc b/052_Mapping_Analysis/40_Analysis.asciidoc index 0f93e8c4f..2fd738a3c 100644 --- a/052_Mapping_Analysis/40_Analysis.asciidoc +++ b/052_Mapping_Analysis/40_Analysis.asciidoc @@ -72,7 +72,7 @@ lowercase. It would produce Language analyzers:: -Language-specific analyzers ((("language analyzers")))are available for http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html[many languages]. They are able to +Language-specific analyzers ((("language analyzers")))are available for {ref}/analysis-lang-analyzer.html[many languages]. They are able to take the peculiarities of the specified language into account. For instance, the `english` analyzer comes with a set of English ((("stopwords")))stopwords (common words like `and` or `the` that don't have much impact on relevance), which it @@ -202,7 +202,7 @@ that the original word occupied in the original string. TIP: The `type` values like `` vary ((("types", "type values returned by analyzers")))per analyzer and can be ignored. 
The only place that they are used in Elasticsearch is in the -http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/analysis-intro.html#analyze-api[`keep_types` token filter]. +{ref}/analysis-keep-types-tokenfilter.html[`keep_types` token filter]. The `analyze` API is a useful tool for understanding what is happening inside Elasticsearch indices, and we will talk more about it as we progress. diff --git a/060_Distributed_Search/15_Search_options.asciidoc b/060_Distributed_Search/15_Search_options.asciidoc index f170c5295..25e80bfe5 100644 --- a/060_Distributed_Search/15_Search_options.asciidoc +++ b/060_Distributed_Search/15_Search_options.asciidoc @@ -8,7 +8,7 @@ The `preference` parameter allows((("preference parameter")))((("search options" used to handle the search request. It accepts values such as `_primary`, `_primary_first`, `_local`, `_only_node:xyz`, `_prefer_node:xyz`, and `_shards:2,3`, which are explained in detail on the -http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html[search `preference`] +{ref}/search-request-preference.html[search `preference`] documentation page. However, the most generally useful value is some arbitrary string, to avoid diff --git a/070_Index_Mgmt/10_Settings.asciidoc b/070_Index_Mgmt/10_Settings.asciidoc index 0410094cc..ac7373fa6 100644 --- a/070_Index_Mgmt/10_Settings.asciidoc +++ b/070_Index_Mgmt/10_Settings.asciidoc @@ -2,7 +2,7 @@ There are many many knobs((("index settings"))) that you can twiddle to customize index behavior, which you can read about in the -http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_index_settings.html#_index_settings[Index Modules reference documentation], +{ref}/index-modules.html[Index Modules reference documentation], but... TIP: Elasticsearch comes with good defaults. Don't twiddle these knobs until diff --git a/070_Index_Mgmt/20_Custom_Analyzers.asciidoc b/070_Index_Mgmt/20_Custom_Analyzers.asciidoc index cf4043c16..e930833c2 100644 --- a/070_Index_Mgmt/20_Custom_Analyzers.asciidoc +++ b/070_Index_Mgmt/20_Custom_Analyzers.asciidoc @@ -15,7 +15,7 @@ Character filters:: Character filters((("character filters"))) are used to ``tidy up'' a string before it is tokenized. For instance, if our text is in HTML format, it will contain HTML tags like `
<p>` or `<div>
` that we don't want to be indexed. We can use the -http://bit.ly/1B6f4Ay[`html_strip` character filter] +{ref}/analysis-htmlstrip-charfilter.html[`html_strip` character filter] to remove all HTML tags and to convert HTML entities like `Á` into the corresponding Unicode character `Á`. @@ -27,17 +27,17 @@ Tokenizers:: -- An analyzer _must_ have a single tokenizer.((("tokenizers", "in analyzers"))) The tokenizer breaks up the string into individual terms or tokens. The -http://bit.ly/1E3Fd1b[`standard` tokenizer], +{ref}/analysis-standard-tokenizer.html[`standard` tokenizer], which is used((("standard tokenizer"))) in the `standard` analyzer, breaks up a string into individual terms on word boundaries, and removes most punctuation, but other tokenizers exist that have different behavior. For instance, the -http://bit.ly/1ICd585[`keyword` tokenizer] +{ref}/analysis-keyword-tokenizer.html[`keyword` tokenizer] outputs exactly((("keyword tokenizer"))) the same string as it received, without any tokenization. The -http://bit.ly/1xt3t7d[`whitespace` tokenizer] +{ref}/analysis-whitespace-tokenizer.html[`whitespace` tokenizer] splits text((("whitespace tokenizer"))) on whitespace only. The -http://bit.ly/1ICdozA[`pattern` tokenizer] can +{ref}/analysis-pattern-tokenizer.html[`pattern` tokenizer] can be used to split text on a ((("pattern tokenizer")))matching regular expression. -- @@ -49,14 +49,14 @@ specified token filters,((("token filters"))) in the order in which they are spe Token filters may change, add, or remove tokens. We have already mentioned the http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lowercase-tokenizer.html[`lowercase`] and -http://bit.ly/1INX4tN[`stop` token filters], +{ref}/analysis-stop-tokenfilter.html[`stop` token filters], but there are many more available in Elasticsearch. -http://bit.ly/1AUfpDN[Stemming token filters] +{ref}/analysis-stemmer-tokenfilter.html[Stemming token filters] ``stem'' words to ((("stemming token filters")))their root form. The -http://bit.ly/1ylU7Q7[`ascii_folding` filter] +{ref}/analysis-asciifolding-tokenfilter.html[`ascii_folding` filter] removes diacritics,((("ascii_folding filter"))) converting a term like `"très"` into `"tres"`. The -http://bit.ly/1CbkmYe[`ngram`] and -http://bit.ly/1DIf6j5[`edge_ngram` token filters] can produce((("edge_engram token filter")))((("ngram and edge_ngram token filters"))) +{ref}/analysis-ngram-tokenfilter.html[`ngram`] and +{ref}/analysis-edgengram-tokenfilter.html[`edge_ngram` token filters] can produce((("edge_engram token filter")))((("ngram and edge_ngram token filters"))) tokens suitable for partial matching or autocomplete. -- diff --git a/070_Index_Mgmt/40_Custom_Dynamic_Mapping.asciidoc b/070_Index_Mgmt/40_Custom_Dynamic_Mapping.asciidoc index 313eaac8a..7d964577f 100644 --- a/070_Index_Mgmt/40_Custom_Dynamic_Mapping.asciidoc +++ b/070_Index_Mgmt/40_Custom_Dynamic_Mapping.asciidoc @@ -60,7 +60,7 @@ a `date` field, you have to add it manually. [NOTE] ==== Elasticsearch's idea of which strings look like dates can be altered -with the http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html[`dynamic_date_formats` setting]. +with the {ref}/dynamic-field-mapping.html#date-detection[`dynamic_date_formats` setting]. ==== [[dynamic-templates]] @@ -140,4 +140,4 @@ The `unmatch` and `path_unmatch` patterns((("unmatch pattern")))((("path_unmap p that would otherwise match. 
More configuration options can be found in the -http://bit.ly/1wdHOzG[reference documentation for the root object]. +{ref}/dynamic-mapping.html[dynamic mapping documentation]. diff --git a/075_Inside_a_shard/50_Persistent_changes.asciidoc b/075_Inside_a_shard/50_Persistent_changes.asciidoc index f3f482170..2298577e3 100644 --- a/075_Inside_a_shard/50_Persistent_changes.asciidoc +++ b/075_Inside_a_shard/50_Persistent_changes.asciidoc @@ -83,10 +83,10 @@ image::images/elas_1109.png["After a flush, the segments are fully commited and The action of performing a commit and truncating the translog is known in Elasticsearch as a _flush_. ((("flushes"))) Shards are flushed automatically every 30 minutes, or when the translog becomes too big. See the -http://bit.ly/1E3HKbD[`translog` documentation] for settings +{ref}/index-modules-translog.html#_translog_settings[`translog` documentation] for settings that can be used((("translog (transaction log)", "flushes and"))) to control these thresholds: -The http://bit.ly/1ICgxiU[`flush` API] can ((("indices", "flushing")))((("flush API")))be used to perform a manual flush: +The {ref}/indices-flush.html[`flush` API] can ((("indices", "flushing")))((("flush API")))be used to perform a manual flush: [source,json] ----------------------------- diff --git a/080_Structured_Search/25_ranges.asciidoc b/080_Structured_Search/25_ranges.asciidoc index 429a96e46..13506aaad 100644 --- a/080_Structured_Search/25_ranges.asciidoc +++ b/080_Structured_Search/25_ranges.asciidoc @@ -117,7 +117,7 @@ math expression: Date math is _calendar aware_, so it knows the number of days in each month, days in a year, and so forth. More details about working with dates can be found in -the http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html[date format reference documentation]. +the {ref}/mapping-date-format.html[date format reference documentation]. ==== Ranges on Strings diff --git a/080_Structured_Search/40_bitsets.asciidoc b/080_Structured_Search/40_bitsets.asciidoc index 9c465ca0a..d9b4aa144 100644 --- a/080_Structured_Search/40_bitsets.asciidoc +++ b/080_Structured_Search/40_bitsets.asciidoc @@ -78,7 +78,7 @@ doesn't make sense to do so: Script filters:: -The results((("script filters, no caching of results"))) from http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/filter-caching.html#_controlling_caching[`script` filters] cannot +The results((("script filters, no caching of results"))) from {ref}/query-dsl-script-query.html cannot be cached because the meaning of the script is opaque to Elasticsearch. Geo-filters:: diff --git a/100_Full_Text_Search/10_Multi_word_queries.asciidoc b/100_Full_Text_Search/10_Multi_word_queries.asciidoc index 43d09c5f5..dcb5fa7d9 100644 --- a/100_Full_Text_Search/10_Multi_word_queries.asciidoc +++ b/100_Full_Text_Search/10_Multi_word_queries.asciidoc @@ -148,7 +148,7 @@ must match for a document to be considered a match. The `minimum_should_match` parameter is flexible, and different rules can be applied depending on the number of terms the user enters. 
For the full documentation see the -http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-minimum-should-match.html#query-dsl-minimum-should-match +{ref}/query-dsl-minimum-should-match.html#query-dsl-minimum-should-match ==== To fully understand how the `match` query handles multiword queries, we need diff --git a/100_Full_Text_Search/30_Controlling_analysis.asciidoc b/100_Full_Text_Search/30_Controlling_analysis.asciidoc index 9b1061290..e9e842d28 100644 --- a/100_Full_Text_Search/30_Controlling_analysis.asciidoc +++ b/100_Full_Text_Search/30_Controlling_analysis.asciidoc @@ -194,6 +194,6 @@ setting instead. A common work flow for time based data like logging is to create a new index per day on the fly by just indexing into it. While this work flow prevents you from creating your index up front, you can still use -http://bit.ly/1ygczeq[index templates] +{ref}/indices-templates.html[index templates] to specify the settings and mappings that a new index should have. ==== diff --git a/130_Partial_Matching/35_Search_as_you_type.asciidoc b/130_Partial_Matching/35_Search_as_you_type.asciidoc index 72ce84ad8..99b7ab0e1 100644 --- a/130_Partial_Matching/35_Search_as_you_type.asciidoc +++ b/130_Partial_Matching/35_Search_as_you_type.asciidoc @@ -286,7 +286,7 @@ fast. However, sometimes it is not fast enough. Latency matters, especially when you are trying to provide instant feedback. Sometimes the fastest way of searching is not to search at all. -The http://bit.ly/1IChV5j[completion suggester] in +The {ref}/search-suggesters-completion.html[completion suggester] in Elasticsearch((("completion suggester"))) takes a completely different approach. You feed it a list of all possible completions, and it builds them into a _finite state transducer_, an((("Finite State Transducer"))) optimized data structure that resembles a big graph. To diff --git a/130_Partial_Matching/40_Compound_words.asciidoc b/130_Partial_Matching/40_Compound_words.asciidoc index 3180b1e80..7a4d18bfb 100644 --- a/130_Partial_Matching/40_Compound_words.asciidoc +++ b/130_Partial_Matching/40_Compound_words.asciidoc @@ -27,7 +27,7 @@ see ``Aussprachewörtebuch'' in the results list. Similarly, a search for ``Adler'' (eagle) should include ``Weißkopfseeadler.'' One approach to indexing languages like this is to break compound words into -their constituent parts using the http://bit.ly/1ygdjjC[compound word token filter]. +their constituent parts using the {ref}/analysis-compound-word-tokenfilter.html[compound word token filter]. However, the quality of the results depends on how good your compound-word dictionary is. diff --git a/170_Relevance/30_Not_quite_not.asciidoc b/170_Relevance/30_Not_quite_not.asciidoc index bd5308091..4e08f0bd0 100644 --- a/170_Relevance/30_Not_quite_not.asciidoc +++ b/170_Relevance/30_Not_quite_not.asciidoc @@ -34,7 +34,7 @@ too strict. [[boosting-query]] ==== boosting Query -The http://bit.ly/1IO281f[`boosting` query] solves((("boosting query")))((("relevance", "controlling", "boosting query"))) this problem. +The {ref}/query-dsl-boosting-query.html[`boosting` query] solves((("boosting query")))((("relevance", "controlling", "boosting query"))) this problem. 
It allows us to still include results that appear to be about the fruit or the pastries, but to downgrade them--to rank them lower than they would otherwise be: diff --git a/170_Relevance/35_Ignoring_TFIDF.asciidoc b/170_Relevance/35_Ignoring_TFIDF.asciidoc index 3201bc713..fe1c61f8c 100644 --- a/170_Relevance/35_Ignoring_TFIDF.asciidoc +++ b/170_Relevance/35_Ignoring_TFIDF.asciidoc @@ -39,7 +39,7 @@ isn't, `0`. [[constant-score-query]] ==== constant_score Query -Enter the http://bit.ly/1DIgSAK[`constant_score`] query. +Enter the {ref}/query-dsl-constant-score-query.html[`constant_score`] query. This ((("constant_score query")))query can wrap either a query or a filter, and assigns a score of `1` to any documents that match, regardless of TF/IDF: diff --git a/170_Relevance/40_Function_score_query.asciidoc b/170_Relevance/40_Function_score_query.asciidoc index 1aa465e76..3e6ec8cb8 100644 --- a/170_Relevance/40_Function_score_query.asciidoc +++ b/170_Relevance/40_Function_score_query.asciidoc @@ -1,7 +1,7 @@ [[function-score-query]] === function_score Query -The http://bit.ly/1sCKtHW[`function_score` query] is the +The {ref}/query-dsl-function-score-query.html[`function_score` query] is the ultimate tool for taking control of the scoring process.((("function_score query")))((("relevance", "controlling", "function_score query"))) It allows you to apply a function to each document that matches the main query in order to alter or completely replace the original query `_score`. diff --git a/170_Relevance/45_Popularity.asciidoc b/170_Relevance/45_Popularity.asciidoc index c217c5d21..f1938e7fa 100644 --- a/170_Relevance/45_Popularity.asciidoc +++ b/170_Relevance/45_Popularity.asciidoc @@ -110,7 +110,7 @@ GET /blogposts/post/_search The available modifiers are `none` (the default), `log`, `log1p`, `log2p`, `ln`, `ln1p`, `ln2p`, `square`, `sqrt`, and `reciprocal`. You can read more about them in the -http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_field_value_factor[`field_value_factor` documentation]. +{ref}/query-dsl-function-score-query.html#_field_value_factor[`field_value_factor` documentation]. ==== factor diff --git a/170_Relevance/65_Script_score.asciidoc b/170_Relevance/65_Script_score.asciidoc index 74e397055..c6ca3babf 100644 --- a/170_Relevance/65_Script_score.asciidoc +++ b/170_Relevance/65_Script_score.asciidoc @@ -106,7 +106,7 @@ a profit. The `script_score` function provides enormous flexibility.((("scripts", "performance and"))) Within a script, you have access to the fields of the document, to the current `_score`, and even to the term frequencies, inverse document frequencies, and field length -norms (see http://bit.ly/1E3Rbbh[Text scoring in scripts]). +norms (see {ref}/modules-advanced-scripting.html[Text scoring in scripts]). That said, scripts can have a performance impact. If you do find that your scripts are not quite fast enough, you have three options: @@ -115,7 +115,7 @@ scripts are not quite fast enough, you have three options: document. * Groovy is fast, but not quite as fast as Java.((("Java", "scripting in"))) You could reimplement your script as a native Java script. (See - http://bit.ly/1ynBidJ[Native Java Scripts]). + {ref}/modules-scripting.html#native-java-scripts[Native Java Scripts]). * Use the `rescore` functionality((("rescoring"))) described in <> to apply your script to only the best-scoring documents. 
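To make that last option more concrete, here is a minimal sketch of wrapping a `script_score` function inside a `rescore` block so that the script runs only against the top documents returned by the main query. The `blogposts` index, the query text, and the `votes` field are illustrative assumptions, not part of the chapter's own example:

[source,json]
--------------------------------
GET /blogposts/post/_search
{
  "query": {
    "match": { "title": "relevance" } <1>
  },
  "rescore": {
    "window_size": 50, <2>
    "query": {
      "rescore_query": {
        "function_score": {
          "query": { "match": { "title": "relevance" } },
          "script_score": {
            "script": "_score * (1 + doc['votes'].value)" <3>
          }
        }
      }
    }
  }
}
--------------------------------
<1> The cheap main query runs against the whole index.
<2> Only the top 50 documents per shard are rescored.
<3> The script touches only those documents; `votes` is a hypothetical field.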
diff --git a/170_Relevance/70_Pluggable_similarities.asciidoc b/170_Relevance/70_Pluggable_similarities.asciidoc index db93c33ee..da158f035 100644 --- a/170_Relevance/70_Pluggable_similarities.asciidoc +++ b/170_Relevance/70_Pluggable_similarities.asciidoc @@ -5,7 +5,7 @@ Before we move on from relevance and scoring, we will finish this chapter with a more advanced subject: pluggable similarity algorithms.((("similarity algorithms", "pluggable")))((("relevance", "controlling", "using pluggable similarity algorithms"))) While Elasticsearch uses the <> as its default similarity algorithm, it supports other algorithms out of the box, which are listed -in the http://bit.ly/14Eiw7f[Similarity Modules] documentation. +in the {ref}/index-modules-similarity.html#configuration[Similarity Modules] documentation. [[bm25]] ==== Okapi BM25 diff --git a/200_Language_intro/30_Language_pitfalls.asciidoc b/200_Language_intro/30_Language_pitfalls.asciidoc index 9ae51fb46..27713bcff 100644 --- a/200_Language_intro/30_Language_pitfalls.asciidoc +++ b/200_Language_intro/30_Language_pitfalls.asciidoc @@ -63,7 +63,7 @@ It is not sufficient just to think about your documents, though.((("queries", "m to think about how your users will query those documents. Often you will be able to identify the main language of the user either from the language of that user's chosen interface (for example, `mysite.de` versus `mysite.fr`) or from the -http://bit.ly/1BwEl61[`accept-language`] +http://www.w3.org/International/questions/qa-lang-priorities.en.php[`accept-language`] HTTP header from the user's browser. User searches also come in three main varieties: @@ -97,10 +97,10 @@ cases, you need to use a heuristic to identify the predominant language. Fortunately, libraries are available in several languages to help with this problem. Of particular note is the -http://bit.ly/1AUr3i2[chromium-compact-language-detector] +https://github.com/mikemccand/chromium-compact-language-detector[chromium-compact-language-detector] library from -http://bit.ly/1AUr85k[Mike McCandless], -which uses the open source (http://bit.ly/1u9KKgI[Apache License 2.0]) +http://blog.mikemccandless.com/2013/08/a-new-version-of-compact-language.html[Mike McCandless], +which uses the open source (http://www.apache.org/licenses/LICENSE-2.0[Apache License 2.0]) https://code.google.com/p/cld2/[Compact Language Detector] (CLD) from Google. It is small, fast, ((("Compact Language Detector (CLD)")))and accurate, and can detect 160+ languages from as little as two sentences. It can even detect multiple languages within a single block of diff --git a/200_Language_intro/50_One_language_per_field.asciidoc b/200_Language_intro/50_One_language_per_field.asciidoc index 200cb8240..ba4ebf9ec 100644 --- a/200_Language_intro/50_One_language_per_field.asciidoc +++ b/200_Language_intro/50_One_language_per_field.asciidoc @@ -61,8 +61,8 @@ Like the _index-per-language_ approach, the _field-per-language_ approach maintains clean term frequencies. It is not quite as flexible as having separate indices. Although it is easy to add a new field by using the <>, those new fields may require new custom analyzers, which can only be set up at index creation time. 
As a -workaround, you can http://bit.ly/1B6s0WY[close] the index, add the new -analyzers with the http://bit.ly/1zijFPx[`update-settings` API], +workaround, you can {ref}/indices-open-close.html[close] the index, add the new +analyzers with the {ref}/indices-update-settings.html[`update-settings` API], then reopen the index, but closing the index means that it will require some downtime. diff --git a/220_Token_normalization/60_Sorting_and_collations.asciidoc b/220_Token_normalization/60_Sorting_and_collations.asciidoc index 5c0e0b1f8..fdaebd136 100644 --- a/220_Token_normalization/60_Sorting_and_collations.asciidoc +++ b/220_Token_normalization/60_Sorting_and_collations.asciidoc @@ -327,7 +327,7 @@ German phonebooks:: ================================================== You can read more about the locales supported by ICU at: -http://bit.ly/1u9LEdp. +http://userguide.icu-project.org/locale. ================================================== diff --git a/230_Stemming/10_Algorithmic_stemmers.asciidoc b/230_Stemming/10_Algorithmic_stemmers.asciidoc index c461965fa..832940d7c 100644 --- a/230_Stemming/10_Algorithmic_stemmers.asciidoc +++ b/230_Stemming/10_Algorithmic_stemmers.asciidoc @@ -21,7 +21,7 @@ written in Snowball. [TIP] ================================================== -The http://bit.ly/1IObUjZ[`kstem` token filter] is a stemmer +The {ref}/analysis-kstem-tokenfilter.html[`kstem` token filter] is a stemmer for English which((("kstem token filter"))) combines the algorithmic approach with a built-in dictionary. The dictionary contains a list of root words and exceptions in order to avoid conflating words incorrectly. `kstem` tends to stem less @@ -32,18 +32,18 @@ aggressively than the Porter stemmer. ==== Using an Algorithmic Stemmer While you ((("stemming words", "algorithmic stemmers", "using")))can use the -http://bit.ly/17LseXy[`porter_stem`] or -http://bit.ly/1IObUjZ[`kstem`] token filter directly, or +{ref}/analysis-porterstem-tokenfilter.html[`porter_stem`] or +{ref}/analysis-kstem-tokenfilter.html[`kstem`] token filter directly, or create a language-specific Snowball stemmer with the -http://bit.ly/1Cr4tNI[`snowball`] token filter, all of the +{ref}/analysis-snowball-tokenfilter.html[`snowball`] token filter, all of the algorithmic stemmers are exposed via a single unified interface: -the http://bit.ly/1AUfpDN[`stemmer` token filter], which +the {ref}/analysis-stemmer-tokenfilter.html[`stemmer` token filter], which accepts the `language` parameter. For instance, perhaps you find the default stemmer used by the `english` analyzer to be too aggressive and ((("english analyzer", "default stemmer, examining")))you want to make it less aggressive. The first step is to look up the configuration for the `english` analyzer -in the http://bit.ly/1xtdoJV[language analyzers] +in the {ref}/analysis-lang-analyzer.html[language analyzers] documentation, which shows the following: [source,js] @@ -97,9 +97,9 @@ Having reviewed the current configuration, we can use it as the basis for a new analyzer, with((("english analyzer", "customizing the stemmer"))) the following changes: * Change the `english_stemmer` from `english` (which maps to the - http://bit.ly/17LseXy[`porter_stem`] token filter) + {ref}/analysis-porterstem-tokenfilter.html[`porter_stem`] token filter) to `light_english` (which maps to the less aggressive - http://bit.ly/1IObUjZ[`kstem`] token filter). + {ref}/analysis-kstem-tokenfilter.html[`kstem`] token filter). 
* Add the <> token filter to remove any diacritics from foreign words.((("asciifolding token filter"))) diff --git a/230_Stemming/30_Hunspell_stemmer.asciidoc b/230_Stemming/30_Hunspell_stemmer.asciidoc index a759860be..6fb7c772f 100644 --- a/230_Stemming/30_Hunspell_stemmer.asciidoc +++ b/230_Stemming/30_Hunspell_stemmer.asciidoc @@ -2,7 +2,7 @@ === Hunspell Stemmer Elasticsearch provides ((("dictionary stemmers", "Hunspell stemmer")))((("stemming words", "dictionary stemmers", "Hunspell stemmer")))dictionary-based stemming via the -http://bit.ly/1KNFdXI[`hunspell` token filter]. +{ref}/analysis-hunspell-tokenfilter.html[`hunspell` token filter]. Hunspell http://hunspell.sourceforge.net/[_hunspell.sourceforge.net_] is the spell checker used by Open Office, LibreOffice, Chrome, Firefox, Thunderbird, and many other open and closed source projects. @@ -245,4 +245,4 @@ PFX A 0 re . <4> combined with the suffix rules to form `reanalyzes`, `reanalyzed`, `reanalyzing`. -More information about the Hunspell syntax can be found on the http://bit.ly/1ynGhv6[Hunspell documentation site]. +More information about the Hunspell syntax can be found on the http://sourceforge.net/projects/hunspell/files/Hunspell/Documentation/[Hunspell documentation site]. diff --git a/230_Stemming/40_Choosing_a_stemmer.asciidoc b/230_Stemming/40_Choosing_a_stemmer.asciidoc index 882237aa6..440644f12 100644 --- a/230_Stemming/40_Choosing_a_stemmer.asciidoc +++ b/230_Stemming/40_Choosing_a_stemmer.asciidoc @@ -2,16 +2,16 @@ === Choosing a Stemmer The documentation for the -http://bit.ly/1AUfpDN[`stemmer`] token filter +{ref}/analysis-stemmer-tokenfilter.html[`stemmer`] token filter lists multiple stemmers for some languages.((("stemming words", "choosing a stemmer")))((("English", "stemmers for"))) For English we have the following: `english`:: - The http://bit.ly/17LseXy[`porter_stem`] token filter. + The {ref}/analysis-porterstem-tokenfilter.html[`porter_stem`] token filter. `light_english`:: - The http://bit.ly/1IObUjZ[`kstem`] token filter. + The {ref}/analysis-kstem-tokenfilter.html[`kstem`] token filter. `minimal_english`:: @@ -19,19 +19,19 @@ lists multiple stemmers for some languages.((("stemming words", "choosing a stem `lovins`:: - The http://bit.ly/1Cr4tNI[Snowball] based - http://bit.ly/1ICyTjR[Lovins] + The {ref}/analysis-snowball-tokenfilter.html[Snowball] based + http://snowball.tartarus.org/algorithms/lovins/stemmer.html[Lovins] stemmer, the first stemmer ever produced. `porter`:: - The http://bit.ly/1Cr4tNI[Snowball] based - http://bit.ly/1sCWihj[Porter] stemmer + The {ref}/analysis-snowball-tokenfilter.html[Snowball] based + http://snowball.tartarus.org/algorithms/porter/stemmer.html[Porter] stemmer `porter2`:: - The http://bit.ly/1Cr4tNI[Snowball] based - http://bit.ly/1zip3lK[Porter2] stemmer + The {ref}/analysis-snowball-tokenfilter.html[Snowball] based + http://snowball.tartarus.org/algorithms/english/stemmer.html[Porter2] stemmer `possessive_english`:: diff --git a/230_Stemming/50_Controlling_stemming.asciidoc b/230_Stemming/50_Controlling_stemming.asciidoc index 7565292f3..8b228efbb 100644 --- a/230_Stemming/50_Controlling_stemming.asciidoc +++ b/230_Stemming/50_Controlling_stemming.asciidoc @@ -8,8 +8,8 @@ your use case, it is important to keep `skies` and `skiing` as distinct words rather than stemming them both down to `ski` (as would happen with the `english` analyzer). 
-The http://bit.ly/1IOeXZD[`keyword_marker`] and -http://bit.ly/1ymcioJ[`stemmer_override`] token filters((("stemmer_override token filter")))((("keyword_marker token filter"))) +The {ref}/analysis-keyword-marker-tokenfilter.html[`keyword_marker`] and +{ref}/analysis-stemmer-override-tokenfilter.html[`stemmer_override`] token filters((("stemmer_override token filter")))((("keyword_marker token filter"))) allow us to customize the stemming process. [[preventing-stemming]] @@ -18,12 +18,12 @@ allow us to customize the stemming process. The <> parameter for language analyzers (see <>) allowed ((("stemming words", "controlling stemming", "preventing stemming")))us to specify a list of words that should not be stemmed. Internally, these language analyzers use the -http://bit.ly/1IOeXZD[`keyword_marker` token filter] +{ref}/analysis-keyword-marker-tokenfilter.html[`keyword_marker` token filter] to mark the listed words as _keywords_, which prevents subsequent stemming token filters from touching those words.((("keyword_marker token filter", "preventing stemming of certain words"))) For instance, we can create a simple custom analyzer that uses the -http://bit.ly/17LseXy[`porter_stem`] token filter, +{ref}/analysis-porterstem-tokenfilter.html[`porter_stem`] token filter, but prevents the word `skies` from((("porter_stem token filter"))) being stemmed: [source,json] @@ -83,7 +83,7 @@ file. In the preceding example, we prevented `skies` from being stemmed, but perhaps we would prefer it to be stemmed to `sky` instead.((("stemming words", "controlling stemming", "customizing stemming"))) The -http://bit.ly/1ymcioJ[`stemmer_override`] token +{ref}/analysis-stemmer-override-tokenfilter.html[`stemmer_override`] token filter allows us ((("stemmer_override token filter")))to specify our own custom stemming rules. At the same time, we can handle some irregular forms like stemming `mice` to `mouse` and `feet` to `foot`: diff --git a/230_Stemming/60_Stemming_in_situ.asciidoc b/230_Stemming/60_Stemming_in_situ.asciidoc index 3db7034b4..8670c7a6d 100644 --- a/230_Stemming/60_Stemming_in_situ.asciidoc +++ b/230_Stemming/60_Stemming_in_situ.asciidoc @@ -39,7 +39,7 @@ Pos 4: (jumped,jump) To prevent the useless repetition of terms that are the same in their stemmed and unstemmed forms, we add the -http://bit.ly/1B6xHUY[`unique`] token filter((("unique token filter"))) into the mix: +{ref}/analysis-unique-tokenfilter.html[`unique`] token filter((("unique token filter"))) into the mix: [source,json] ------------------------------------ diff --git a/240_Stopwords/20_Using_stopwords.asciidoc b/240_Stopwords/20_Using_stopwords.asciidoc index d39afc959..4fa4e438a 100644 --- a/240_Stopwords/20_Using_stopwords.asciidoc +++ b/240_Stopwords/20_Using_stopwords.asciidoc @@ -2,22 +2,22 @@ === Using Stopwords The removal of stopwords is ((("stopwords", "removal of")))handled by the -http://bit.ly/1INX4tN[`stop` token filter] which can be used +{ref}/analysis-stop-tokenfilter.html[`stop` token filter] which can be used when ((("stop token filter")))creating a `custom` analyzer (see <>). 
However, some out-of-the-box analyzers((("analyzers", "stop filter pre-integrated")))((("pattern analyzer", "stopwords and")))((("standard analyzer", "stop filter")))((("language analyzers", "stop filter pre-integrated"))) come with the `stop` filter pre-integrated: -http://bit.ly/1xtdoJV[Language analyzers]:: +{ref}/analysis-lang-analyzer.html[Language analyzers]:: Each language analyzer defaults to using the appropriate stopwords list for that language. For instance, the `english` analyzer uses the `_english_` stopwords list. -http://bit.ly/14EpXv3[`standard` analyzer]:: +{ref}/analysis-standard-analyzer.html[`standard` analyzer]:: Defaults to the empty stopwords list: `_none_`, essentially disabling stopwords. -http://bit.ly/1u9OVct[`pattern` analyzer]:: +{ref}/analysis-pattern-analyzer.html[`pattern` analyzer]:: Defaults to `_none_`, like the `standard` analyzer. @@ -112,7 +112,7 @@ The default stopword list for a particular language can be specified using the TIP: The predefined language-specific stopword((("languages", "predefined stopword lists for"))) lists available in Elasticsearch can be found in the -http://bit.ly/157YLFy[`stop` token filter] documentation. +{ref}/analysis-stop-tokenfilter.html[`stop` token filter] documentation. Stopwords can be disabled by ((("stopwords", "disabling")))specifying the special list: `_none_`. For instance, to use the `english` analyzer((("english analyzer", "using without stopwords"))) without stopwords, you can do the @@ -163,7 +163,7 @@ PUT /my_index [[stop-token-filter]] ==== Using the stop Token Filter -The http://bit.ly/1AUzDNI[`stop` token filter] can be combined +The {ref}/analysis-stop-tokenfilter.html[`stop` token filter] can be combined with a tokenizer((("stopwords", "using stop token filter")))((("stop token filter", "using in custom analyzer"))) and other token filters when you need to create a `custom` analyzer. For instance, let's say that we wanted to ((("Spanish", "custom analyzer for")))((("light_spanish stemmer")))create a Spanish analyzer with the following: @@ -226,7 +226,7 @@ node is restarted, or when a closed index is reopened. If you specify stopwords inline with the `stopwords` parameter, your only option is to close the index and update the analyzer configuration with the -http://bit.ly/1zijFPx[update index settings API], then reopen +{ref}/indices-update-settings.html#update-settings-analysis[update index settings API], then reopen the index. Updating stopwords is easier if you specify them in a file with the @@ -234,7 +234,7 @@ Updating stopwords is easier if you specify them in a file with the the cluster) and then force the analyzers to be re-created by either of these actions: * Closing and reopening the index - (see http://bit.ly/1B6s0WY[open/close index]), or + (see {ref}/indices-open-close.html[open/close index]), or * Restarting each node in the cluster, one by one Of course, updating the stopwords list will not change any documents that have diff --git a/240_Stopwords/40_Divide_and_conquer.asciidoc b/240_Stopwords/40_Divide_and_conquer.asciidoc index 03493e211..e7a3c524b 100644 --- a/240_Stopwords/40_Divide_and_conquer.asciidoc +++ b/240_Stopwords/40_Divide_and_conquer.asciidoc @@ -188,5 +188,5 @@ documents that have 75% of all high-frequency terms with a query like this: } --------------------------------- -See the http://bit.ly/1wdS2Qo[`common` terms query] reference page for more options. +See the {ref}/query-dsl-common-terms-query.html[`common` terms query] reference page for more options. 
diff --git a/260_Synonyms/20_Using_synonyms.asciidoc b/260_Synonyms/20_Using_synonyms.asciidoc index 8f47b193b..3c89c7c58 100644 --- a/260_Synonyms/20_Using_synonyms.asciidoc +++ b/260_Synonyms/20_Using_synonyms.asciidoc @@ -2,7 +2,7 @@ === Using Synonyms Synonyms can replace existing tokens or((("synonyms", "using"))) be added to the token stream by using the((("synonym token filter"))) -http://bit.ly/1DInEGD[`synonym` token filter]: +{ref}/analysis-synonym-tokenfilter.html[`synonym` token filter]: [source,json] ------------------------------------- diff --git a/260_Synonyms/60_Multi_word_synonyms.asciidoc b/260_Synonyms/60_Multi_word_synonyms.asciidoc index 5a4d421f7..5b6ec7d91 100644 --- a/260_Synonyms/60_Multi_word_synonyms.asciidoc +++ b/260_Synonyms/60_Multi_word_synonyms.asciidoc @@ -168,8 +168,8 @@ frequently lead to surprising results or even syntax errors. One of the gotchas of this query involves multiword synonyms. To support its search-syntax, it has to parse the query string to recognize special operators like `AND`, `OR`, `+`, `-`, `field:`, and so forth. (See the full -http://bit.ly/151G5I1[`query_string` syntax] -here.) +{ref}/query-dsl-query-string-query.html#query-string-syntax[`query_string` syntax] +for more information.) As part of this parsing process, it breaks up the query string on whitespace, and passes each word that it finds to the relevant analyzer separately. This diff --git a/260_Synonyms/70_Symbol_synonyms.asciidoc b/260_Synonyms/70_Symbol_synonyms.asciidoc index 8178bfc2b..cb397c8e3 100644 --- a/260_Synonyms/70_Symbol_synonyms.asciidoc +++ b/260_Synonyms/70_Symbol_synonyms.asciidoc @@ -18,7 +18,7 @@ The `standard` tokenizer would simply strip out the emoticon in the second sentence, conflating two sentences that have quite different intent. We can use the -http://bit.ly/1ziua5n[`mapping` character filter] +{ref}/analysis-mapping-charfilter.html[`mapping` character filter] to replace emoticons((("mapping character filter", "replacing emoticons with symbol synonyms")))((("emoticons", "replacing with symbol synonyms"))) with symbol synonyms like `emoticon_happy` and `emoticon_sad` before the text is passed to the tokenizer: @@ -64,4 +64,4 @@ have used real words, like `happy` and `sad`. TIP: The `mapping` character filter is useful for simple replacements of exact character sequences. ((("mapping character filter", "replacements of exact character sequences")))For more-flexible pattern matching, you can use regular expressions with the -http://bit.ly/1DK4hgy[`pattern_replace` character filter]. +{ref}/analysis-pattern-replace-charfilter.html[`pattern_replace` character filter]. diff --git a/270_Fuzzy_matching/20_Fuzziness.asciidoc b/270_Fuzzy_matching/20_Fuzziness.asciidoc index 820ea8539..4a6048493 100644 --- a/270_Fuzzy_matching/20_Fuzziness.asciidoc +++ b/270_Fuzzy_matching/20_Fuzziness.asciidoc @@ -28,7 +28,7 @@ following steps: 3. Transpose `a` and `e`: b_ae_ver -> b_ea_ver These three steps represent a -http://bit.ly/1ymgZPB[Damerau-Levenshtein edit distance] +https://en.wikipedia.org/wiki/Damerau–Levenshtein_distance[Damerau-Levenshtein edit distance] of 3. 
Clearly, `bieber` is a long way from `beaver`—they are too far apart to be diff --git a/270_Fuzzy_matching/30_Fuzzy_query.asciidoc b/270_Fuzzy_matching/30_Fuzzy_query.asciidoc index 28d669220..fb8c4b46a 100644 --- a/270_Fuzzy_matching/30_Fuzzy_query.asciidoc +++ b/270_Fuzzy_matching/30_Fuzzy_query.asciidoc @@ -1,7 +1,7 @@ [[fuzzy-query]] === Fuzzy Query -The http://bit.ly/1ymh8Cu[`fuzzy` query] is ((("typoes and misspellings", "fuzzy query")))((("fuzzy queries")))the fuzzy equivalent of +The {ref}/query-dsl-fuzzy-query.html[`fuzzy` query] is ((("typoes and misspellings", "fuzzy query")))((("fuzzy queries")))the fuzzy equivalent of the `term` query. You will seldom use it directly yourself, but understanding how it works will help you to use fuzziness in the higher-level `match` query. diff --git a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc index 493306353..f45176495 100644 --- a/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc +++ b/270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc @@ -27,7 +27,7 @@ without interfering with the relevance scoring of nonfuzzy queries. Fuzzy queries alone are much less useful than they initially appear. They are better used as part of a ``bigger'' feature, such as the _search-as-you-type_ -http://bit.ly/1IChV5j[`completion` suggester] or the -_did-you-mean_ http://bit.ly/1IOj5ZG[`phrase` suggester]. +{ref}/search-suggesters-completion.html[`completion` suggester] or the +_did-you-mean_ {ref}/search-suggesters-phrase.html[`phrase` suggester]. ================================================== diff --git a/270_Fuzzy_matching/60_Phonetic_matching.asciidoc b/270_Fuzzy_matching/60_Phonetic_matching.asciidoc index 0fe04e6de..6e2fd59b6 100644 --- a/270_Fuzzy_matching/60_Phonetic_matching.asciidoc +++ b/270_Fuzzy_matching/60_Phonetic_matching.asciidoc @@ -13,7 +13,7 @@ http://en.wikipedia.org/wiki/Metaphone#Double_Metaphone[Double Metaphone] (which expands phonetic matching to languages other than English), http://en.wikipedia.org/wiki/Caverphone[Caverphone] for matching names in New Zealand, the -http://bit.ly/1E47qoB[Beider-Morse] algorithm, which adopts the Soundex algorithm +https://en.wikipedia.org/wiki/Daitch–Mokotoff_Soundex#Beider.E2.80.93Morse_Phonetic_Name_Matching_Algorithm[Beider-Morse] algorithm, which adopts the Soundex algorithm for better matching of German and Yiddish names, and the http://de.wikipedia.org/wiki/K%C3%B6lner_Phonetik[Kölner Phonetik] for better handling of German words. @@ -25,7 +25,7 @@ purposes, and in combination with other techniques, phonetic matching can be a useful tool. First, you will need to install ((("Phonetic Analysis plugin")))the Phonetic Analysis plug-in from -http://bit.ly/1CreKJQ on every node +https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-phonetic.html on every node in the cluster, and restart each node. 
Then, you can create a custom analyzer that uses one of the diff --git a/300_Aggregations/100_circuit_breaker_fd_settings.asciidoc b/300_Aggregations/100_circuit_breaker_fd_settings.asciidoc index c9ed3be1c..c0e07aff5 100644 --- a/300_Aggregations/100_circuit_breaker_fd_settings.asciidoc +++ b/300_Aggregations/100_circuit_breaker_fd_settings.asciidoc @@ -137,7 +137,7 @@ Fielddata usage can be monitored: GET /_stats/fielddata?fields=* ------------------------------- -* per-node using the http://bit.ly/1586yDn[`nodes-stats` API]: +* per-node using the {ref}/current/cluster-nodes-stats.html[`nodes-stats` API]: + [source,json] ------------------------------- diff --git a/300_Aggregations/115_eager.asciidoc b/300_Aggregations/115_eager.asciidoc index ca72562ae..7d55bee3f 100644 --- a/300_Aggregations/115_eager.asciidoc +++ b/300_Aggregations/115_eager.asciidoc @@ -277,7 +277,7 @@ In practice, select a handful of queries that represent the majority of your user's queries and register those. ==== -Some administrative details (such as getting existing warmers and deleting warmers) that have been omitted from this explanation. Refer to the http://bit.ly/1AUGwys[warmers documentation] for the rest +Some administrative details (such as getting existing warmers and deleting warmers) that have been omitted from this explanation. Refer to the {ref}/indices-warmers.html[warmers documentation] for the rest of the details. diff --git a/300_Aggregations/20_basic_example.asciidoc b/300_Aggregations/20_basic_example.asciidoc index 735d358d6..d0437ada2 100644 --- a/300_Aggregations/20_basic_example.asciidoc +++ b/300_Aggregations/20_basic_example.asciidoc @@ -8,8 +8,7 @@ the syntax is fairly trivial. [NOTE] ========================= -A complete list of aggregation buckets and metrics can be found at the http://bit.ly/1KNL1R3[online -reference documentation]. We'll cover many of them in this chapter, but glance +A complete list of aggregation buckets and metrics can be found at the {ref}/search-aggregations.html[Elasticsearch Reference]. We'll cover many of them in this chapter, but glance over it after finishing so you are familiar with the full range of capabilities. ========================= diff --git a/300_Aggregations/65_percentiles.asciidoc b/300_Aggregations/65_percentiles.asciidoc index 3b0d541c3..cf0861157 100644 --- a/300_Aggregations/65_percentiles.asciidoc +++ b/300_Aggregations/65_percentiles.asciidoc @@ -307,7 +307,7 @@ clearly is not possible when you have billions of values distributed across dozens of nodes. Instead, `percentiles` uses an algorithm called((("TDigest algorithm"))) TDigest (introduced by Ted Dunning -in http://bit.ly/1DIpOWK[Computing Extremely Accurate Quantiles Using T-Digests]). As with HyperLogLog, it isn't +in https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf[Computing Extremely Accurate Quantiles Using T-Digests]). 
As with HyperLogLog, it isn't necessary to understand the full technical details, but it is good to know the properties of the algorithm: diff --git a/300_Aggregations/75_sigterms.asciidoc b/300_Aggregations/75_sigterms.asciidoc index d11dfb4c4..1072cffb8 100644 --- a/300_Aggregations/75_sigterms.asciidoc +++ b/300_Aggregations/75_sigterms.asciidoc @@ -16,7 +16,7 @@ PUT /_snapshot/sigterms <1> { "type": "url", "settings": { - "url": "http://download.elasticsearch.org/definitiveguide/sigterms_demo/" + "url": "http://download.elastic.co/definitiveguide/sigterms_demo/" } } diff --git a/310_Geopoints/34_Geo_distance.asciidoc b/310_Geopoints/34_Geo_distance.asciidoc index b3e847f7c..e36ed0a96 100644 --- a/310_Geopoints/34_Geo_distance.asciidoc +++ b/310_Geopoints/34_Geo_distance.asciidoc @@ -24,7 +24,7 @@ GET /attractions/restaurant/_search } --------------------- <1> Find all `location` fields within `1km` of the specified point. - See http://bit.ly/1ynS64j[Distance Units] for + See {ref}/common-options.html#distance-units[Distance Units] for a list of the accepted units. <2> The central point can be specified as a string, an array, or (as in this example) an object. See <>. diff --git a/320_Geohashes/40_Geohashes.asciidoc b/320_Geohashes/40_Geohashes.asciidoc index 6777af478..e756ab090 100644 --- a/320_Geohashes/40_Geohashes.asciidoc +++ b/320_Geohashes/40_Geohashes.asciidoc @@ -47,6 +47,6 @@ along with the approximate dimensions of each geohash cell: |gcpuuz94kkp5 |12 | ~ 3.7cm x 1.8cm |============================================= -The http://bit.ly/1DIqyex[`geohash_cell` filter] can use +The {ref}/query-dsl-geohash-cell-query.html[`geohash_cell` filter] can use these geohash prefixes((("geohash_cell filter")))((("filters", "geohash_cell"))) to find locations near a specified `lat/lon` point. diff --git a/340_Geoshapes/74_Indexing_geo_shapes.asciidoc b/340_Geoshapes/74_Indexing_geo_shapes.asciidoc index d82b142ba..76ac710f9 100644 --- a/340_Geoshapes/74_Indexing_geo_shapes.asciidoc +++ b/340_Geoshapes/74_Indexing_geo_shapes.asciidoc @@ -56,6 +56,6 @@ GeoJSON syntax is quite simple: ... ] -See the http://bit.ly/1G2nMCT[Geo-shape mapping documentation] for +See the {ref}/geo-shape.html[Geo-shape mapping documentation] for more details about the supported shapes. diff --git a/340_Geoshapes/76_Querying_geo_shapes.asciidoc b/340_Geoshapes/76_Querying_geo_shapes.asciidoc index 0dbaceca2..581a50697 100644 --- a/340_Geoshapes/76_Querying_geo_shapes.asciidoc +++ b/340_Geoshapes/76_Querying_geo_shapes.asciidoc @@ -1,9 +1,7 @@ [[querying-geo-shapes]] === Querying geo-shapes -The unusual thing ((("geo-shapes", "querying")))about the http://bit.ly/1AjFrxE[`geo_shape` -query] and http://bit.ly/1G2ocsZ[`geo_shape` filter] is that -they allow us to query using shapes, rather than just points. +The unusual thing ((("geo-shapes", "querying")))about the {ref}/query-dsl-geo-shape-query.html[`geo_shape` query] is that it allows us to query and filter using shapes, rather than just points. 
For instance, if our user steps out of the central train station in Amsterdam, we could find all landmarks within a 1km radius with a query like this: diff --git a/400_Relationships/22_Top_hits.asciidoc b/400_Relationships/22_Top_hits.asciidoc index 5fbe8e7a8..4a620a57a 100644 --- a/400_Relationships/22_Top_hits.asciidoc +++ b/400_Relationships/22_Top_hits.asciidoc @@ -74,7 +74,7 @@ PUT /my_index/blogpost/4 Now we can run a query looking for blog posts about `relationships`, by users called `John`, and group the results by user, thanks to the -http://bit.ly/1CrlWFQ[`top_hits` aggregation]: +{ref}/search-aggregations-metrics-top-hits-aggregation.html[`top_hits` aggregation]: [source,json] -------------------------------- diff --git a/400_Relationships/25_Concurrency.asciidoc b/400_Relationships/25_Concurrency.asciidoc index 9651a4e9e..d5de8f754 100644 --- a/400_Relationships/25_Concurrency.asciidoc +++ b/400_Relationships/25_Concurrency.asciidoc @@ -61,7 +61,7 @@ To support this, we need to index the path hierarchy: * `/clinton/projects/elasticsearch` This hierarchy can be generated ((("path_hierarchy tokenizer")))automatically from the `path` field using the -http://bit.ly/1AjGltZ[`path_hierarchy` tokenizer]: +{ref}/analysis-pathhierarchy-tokenizer.html[`path_hierarchy` tokenizer]: [source,json] -------------------------- @@ -78,8 +78,7 @@ PUT /fs } } -------------------------- -<1> The custom `paths` analyzer uses the `path_hierarchy` tokenizer with its - default settings. See http://bit.ly/1AjGltZ[`path_hierarchy` tokenizer]. +<1> The custom `paths` analyzer uses the {ref}/analysis-pathhierarchy-tokenizer.html[`path_hierarchy` tokenizer] with its default settings. The mapping for the `file` type would look like this: diff --git a/402_Nested/31_Nested_mapping.asciidoc b/402_Nested/31_Nested_mapping.asciidoc index 76c49b765..a9475386a 100644 --- a/402_Nested/31_Nested_mapping.asciidoc +++ b/402_Nested/31_Nested_mapping.asciidoc @@ -30,5 +30,5 @@ PUT /my_index That's all that is required. Any `comments` objects would now be indexed as separate nested documents. See the -http://bit.ly/1KNQEP9[`nested` type reference docs] for more. +{ref}/nested.html[`nested` type reference docs] for more. diff --git a/402_Nested/32_Nested_query.asciidoc b/402_Nested/32_Nested_query.asciidoc index da7638237..16b6b0cda 100644 --- a/402_Nested/32_Nested_query.asciidoc +++ b/402_Nested/32_Nested_query.asciidoc @@ -3,8 +3,7 @@ Because nested objects ((("nested objects", "querying")))are indexed as separate hidden documents, we can't query them directly. ((("queries", "nested"))) Instead, we have to use the -http://bit.ly/1ziFQoR[`nested` query] or -http://bit.ly/1IOp94r[`nested` filter] to access them: +{ref}/query-dsl-nested-query.html[`nested` query] to access them: [source,json] -------------------------- diff --git a/410_Scaling/40_Multiple_indices.asciidoc b/410_Scaling/40_Multiple_indices.asciidoc index c69d9ecb6..248a10c4d 100644 --- a/410_Scaling/40_Multiple_indices.asciidoc +++ b/410_Scaling/40_Multiple_indices.asciidoc @@ -71,8 +71,8 @@ to point to only the new index. A document `GET` request, like((("HTTP methods", "GET")))((("GET method"))) an indexing request, can target only one index. This makes retrieving a document by ID a bit more complicated in this scenario. Instead, run a search request with the -http://bit.ly/1C4Q0cf[`ids` query], or do a((("mget (multi-get) API"))) -http://bit.ly/1sDd2EX[`multi-get`] request on `tweets_1` and `tweets_2`. 
+{ref}/query-dsl-ids-query.html[`ids` query], or do a((("mget (multi-get) API"))) +{ref}/docs-multi-get.html[`multi-get`] request on `tweets_1` and `tweets_2`. ================================================== diff --git a/410_Scaling/45_Index_per_timeframe.asciidoc b/410_Scaling/45_Index_per_timeframe.asciidoc index 12d739732..b247a8bfc 100644 --- a/410_Scaling/45_Index_per_timeframe.asciidoc +++ b/410_Scaling/45_Index_per_timeframe.asciidoc @@ -5,10 +5,10 @@ One of the most common use cases for Elasticsearch is for logging,((("logging", in fact that Elasticsearch provides an integrated((("ELK stack"))) logging platform called the _ELK stack_—Elasticsearch, Logstash, and Kibana--to make the process easy. -http://www.elasticsearch.org/overview/logstash[Logstash] collects, parses, and +https://www.elastic.co/guide/en/logstash/current/index.html[Logstash] collects, parses, and enriches logs before indexing them into Elasticsearch.((("Logstash"))) Elasticsearch acts as a centralized logging server, and -http://www.elasticsearch.org/overview/kibana[Kibana] is a((("Kibana"))) graphic frontend +https://www.elastic.co/guide/en/kibana/current/index.html[Kibana] is a((("Kibana"))) graphic frontend that makes it easy to query and visualize what is happening across your network in near real-time. diff --git a/410_Scaling/55_Retiring_data.asciidoc b/410_Scaling/55_Retiring_data.asciidoc index 5880fdbad..929de6cb5 100644 --- a/410_Scaling/55_Retiring_data.asciidoc +++ b/410_Scaling/55_Retiring_data.asciidoc @@ -91,7 +91,7 @@ POST /logs_2014-09-30/_settings Of course, without replicas, we run the risk of losing data if a disk suffers catastrophic failure. You may((("snapshot-restore API"))) want to back up the data first, with the -http://bit.ly/14ED13A[`snapshot-restore` API]. +{ref}/modules-snapshots.html[`snapshot-restore` API]. [[close-indices]] ==== Closing Old Indices @@ -123,7 +123,7 @@ POST /logs_2014-01-*/_open <3> Finally, very old indices ((("indices", "archiving old indices")))can be archived off to some long-term storage like a shared disk or Amazon's S3 using the -http://bit.ly/14ED13A[`snapshot-restore` API], just in case you may need +{ref}/modules-snapshots.html[`snapshot-restore` API], just in case you may need to access them in the future. Once a backup exists, the index can be deleted from the cluster. diff --git a/410_Scaling/80_Scale_is_not_infinite.asciidoc b/410_Scaling/80_Scale_is_not_infinite.asciidoc index 213ab2581..2e0cfdfdf 100644 --- a/410_Scaling/80_Scale_is_not_infinite.asciidoc +++ b/410_Scaling/80_Scale_is_not_infinite.asciidoc @@ -83,5 +83,5 @@ small and agile. Eventually, despite your best intentions, you may find that the number of nodes and indices and mappings that you have is just too much for one cluster. At this stage, it is probably worth dividing the problem into multiple -clusters. Thanks to http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-tribe.html[`tribe` nodes], you can even run +clusters. Thanks to {ref}/modules-tribe.html[`tribe` nodes], you can even run searches across multiple clusters, as if they were one big cluster. diff --git a/520_Post_Deployment/10_dynamic_settings.asciidoc b/520_Post_Deployment/10_dynamic_settings.asciidoc index ac3b79cbd..caffdf012 100644 --- a/520_Post_Deployment/10_dynamic_settings.asciidoc +++ b/520_Post_Deployment/10_dynamic_settings.asciidoc @@ -35,5 +35,5 @@ PUT /_cluster/settings restart. 
A complete list of settings that can be updated dynamically can be found in the -http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-update-settings.html[online reference docs]. +{ref}/cluster-update-settings.html[online reference docs]. diff --git a/520_Post_Deployment/30_indexing_perf.asciidoc b/520_Post_Deployment/30_indexing_perf.asciidoc index 0ad5e5fd7..5f08fdcac 100644 --- a/520_Post_Deployment/30_indexing_perf.asciidoc +++ b/520_Post_Deployment/30_indexing_perf.asciidoc @@ -179,7 +179,7 @@ This is much more efficient than duplicating the indexing process. functionality.((("id", "auto-ID functionality of Elasticsearch"))) It is optimized to avoid version lookups, since the autogenerated ID is unique. -- If you are using your own ID, try to pick an ID that is http://bit.ly/1sDiR5t[friendly to Lucene]. ((("UUIDs (universally unique identifiers)"))) Examples include zero-padded +- If you are using your own ID, try to pick an ID that is http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html[friendly to Lucene]. ((("UUIDs (universally unique identifiers)"))) Examples include zero-padded sequential IDs, UUID-1, and nanotime; these IDs have consistent, sequential patterns that compress well. In contrast, IDs such as UUID-4 are essentially random, which offer poor compression and slow down Lucene. diff --git a/Preface.asciidoc b/Preface.asciidoc index 031c4bcad..75f50e868 100644 --- a/Preface.asciidoc +++ b/Preface.asciidoc @@ -1,4 +1,3 @@ -:ref: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/ [preface] == Preface @@ -84,9 +83,9 @@ Elasticsearch available at the time of going to print--version 1.4.0--but Elasticsearch is a rapidly evolving project. The online version of this book will be updated as Elasticsearch changes. -You can find the latest version of this http://www.elasticsearch.org/guide/[book online]. +You can find the latest version of this https://www.elastic.co/guide/en/elasticsearch/guide/current/[book online]. -You can also track the changes that have been made by visiting the https://github.com/elasticsearch/elasticsearch-definitive-guide/[GitHub repository]. +You can also track the changes that have been made by visiting the https://github.com/elastic/elasticsearch-definitive-guide/[GitHub repository]. === How to Read This Book @@ -208,16 +207,16 @@ endif::es_build[] There are three topics that we do not cover in this book, because they are evolving rapidly and anything we write will soon be out-of-date: -* Highlighting of result snippets: see http://bit.ly/151kOhG[Highlighting]. -* _Did-you-mean_ and _search-as-you-type_ suggesters: see http://bit.ly/1INTMa9[Suggesters]. -* Percolation--finding queries which match a document: see http://bit.ly/1KNs3du[Percolators]. +* Highlighting of result snippets: see {ref}/search-request-highlighting.html[Highlighting]. +* _Did-you-mean_ and _search-as-you-type_ suggesters: see {ref}/search-suggesters.html[Suggesters]. +* Percolation--finding queries which match a document: see {ref}/_percolator_2.html[Percolators]. === Online Resources Because this book focuses on problem solving in Elasticsearch rather than syntax, we sometimes reference the existing documentation for a complete -list of parameters. The reference documentation can be found here: +list of parameters. 
The reference documentation can be found at: -http://www.elasticsearch.org/guide/ +https://www.elastic.co/guide/ === Conventions Used in This Book @@ -326,8 +325,8 @@ picked up the slack, put up with our absence and our endless moaning about how long the book was taking, and, most importantly, they are still here. Thank you to Shay Banon for creating Elasticsearch in the first place, and to -Elasticsearch the company for supporting our work on the book. Our colleagues -at Elasticsearch deserve a big thank you as well. They have helped us pick +Elastic the company for supporting our work on the book. Our colleagues +at Elastic deserve a big thank you as well. They have helped us pick through the innards of Elasticsearch to really understand how it works, and they have been responsible for adding improvements and fixing inconsistencies that were brought to light by writing about them. diff --git a/book.asciidoc b/book.asciidoc index 510a93773..42993718c 100644 --- a/book.asciidoc +++ b/book.asciidoc @@ -1,5 +1,6 @@ :bookseries: animal :es_build: 1 +:ref: https://www.elastic.co/guide/en/elasticsearch/reference/current = Elasticsearch: The Definitive Guide diff --git a/snippets/300_Aggregations/75_sigterms.json b/snippets/300_Aggregations/75_sigterms.json index 32e9daca9..d64d075a4 100644 --- a/snippets/300_Aggregations/75_sigterms.json +++ b/snippets/300_Aggregations/75_sigterms.json @@ -3,7 +3,7 @@ PUT /_snapshot/sigterms { "type": "url", "settings": { - "url": "http://download.elasticsearch.org/definitiveguide/sigterms_demo/" + "url": "http://download.elastic.co/definitiveguide/sigterms_demo/" } }
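With the demo snapshot repository now pointing at `download.elastic.co`, a quick sanity check (suggested here, not part of the original text) is to confirm the repository settings and list the snapshots it exposes:

[source,json]
--------------------------------
GET /_snapshot/sigterms <1>

GET /_snapshot/sigterms/_all <2>
--------------------------------
<1> Returns the repository definition, including the updated URL.
<2> Lists every snapshot available in the repository.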