INTELLECTPERITUS KNOWLEDGE SPHERE....We Share, We Grow!

CRUCIAL LANGUAGE BASED CHALLENGES OF PATENT SEARCHING


IKS Category: Patent Searching & Analysis
IKS Article No: IKS_Article_03 September_25_2015
Compilation by: Nimesh Patel; Pritesh Gohel; Tejas Patel
Why This Article? In the world's fast-growing IP industry, every patent searcher must understand the crucial language-based challenges in order to search effectively. We took up this challenge and studied which areas pose difficulties and what remedies are available for them.

Effective patent searching depends on understanding language-based challenges. English is especially problematic: it is complex, rich and inconsistent, with a huge vocabulary and numerous synonyms.

Some major challenges of the English language that need to be considered when searching are:

  • Terminology:
    • Terminology is inconsistent across patents: the same technology may be described with dictionary synonyms or in abstract and creative ways. Terms can take time to be accepted, and some that are popular in the early years are dropped in the end. New concepts are particularly difficult, because many different words are often used until one or two terms become accepted.
    • Examples:
      • For the automobile, US and UK patents between 1895 and 1900 used the words horseless carriage or horseless vehicle. Usage shifted over time, and by the early 1900s most filings used the word automobile, although horseless carriage and horseless vehicle still appeared in smaller numbers.
      • For bicycles, US and British patents from 1893 to 1900 used velocipede and bicycle in equal proportion, but from 1901 to 1910 the use of velocipede declined. Cycle is also used to describe a bicycle, as well as cycles in chemical processes.
      • Terminology changed much more slowly in the aviation industry. The most frequent words used in titles were flying machine, aeroplane, air ship and aerial vessel. Another thirty different words were also used, such as aerodrome, aerial wheel, aerial craft, wind motor, aeronautic apparatus, aerodart, aeromotor and aerial top. The first use of the term aircraft was in 1914, and airplane appeared in 1919.
  • Patentese:
    • Patent attorneys often draft patents in a specialised style of language (Patentese) that is not easily found through ordinary keyword searching.
    • Examples:
      • People describe a glass that is half filled with water in different ways: a pessimist would say the glass is half empty, an optimist would say it is half full, and a patent attorney would describe it in his own way: an open ended cylinder horizontally bisected by liquid H2O. We must therefore keep Patentese in mind when building a search query and performing searches.
      • An Australian innovation patent (Circular Transportation Facilitation Device) by John Keogh effectively obtained a patent on the wheel through the following cleverly drafted abstract: A transportation facilitating device for facilitating transport of goods and persons. In particular, the device relates to a circular object which enables such goods and persons to be held above a surface and simultaneously moved with respect to the surface approximately parallel thereto.
      • A US patent (Device for extracting elements from cavities, which uses a bag for extraction and an applicator) by Jorge similarly covers delivering a baby from the womb; its main CPC class was A61B17/446, defined as obstetrical forceps without pivotal connections, e.g. using a vacuum.
  • Synonyms (Similar Technology Terms):
    • Synonyms are a main challenge in searching, because a single technology can be described with many different words.
    • Often there are many common synonyms that should be used in a search, but including all of them increases the number of results. You may remove some synonyms to reduce the volume, but you cannot be sure that the removed terms were not used by a patent drafter, so there is a chance of missing a relevant result.
    • Selecting the synonyms that suit the particular technology is the main challenge for patent searchers.
    • Examples:
      • An actual abstract from a GB patent (Safety cap for a ball point pen): A cap for a ball point pen includes a truncated end which is pierced by an aperture. The aperture provides an air passage so that respiration can continue should the cap be swallowed and become lodged in the throat. This abstract is clear, but it does not mention children, which would be very helpful to a searcher. It could have been written as: A writing instrument with a pierced cap that fits over the writing end, so that a child who is choking can still breathe.
      • There are many other examples of synonyms:
        • Mobile such as telecommunication, telephone, cellular phone, cellphone, phone, PDA, personal digital assistant and so on. Each of these words returns its own distinct set of results.
        • Pen such as Ball Point, Ballpoint, Writing Instrument, Writing Implement, Writing Device and Writing Equipment.
        • Drug such as medicine, prodrug, pro-drug, medicament, pharmaceutical formulation and pharmaceutical composition.
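The synonym lists above can be turned into a search string mechanically. The following is a minimal Python sketch, not a prescribed method. The function names `or_group` and `build_query` are hypothetical, the generic AND/OR syntax is an assumption that has to be adapted to the operators of the database in use, and the term "closure" in the second list is an illustrative addition that does not come from the examples above.

```python
def or_group(terms):
    """Join synonyms into one parenthesised OR group, quoting multi-word terms."""
    quoted = []
    for term in terms:
        term = term.strip()
        # Multi-word terms are quoted so they are searched as a phrase.
        quoted.append(f'"{term}"' if " " in term else term)
    return "(" + " OR ".join(quoted) + ")"


def build_query(*concepts):
    """Combine one OR group per concept with AND."""
    return " AND ".join(or_group(concept) for concept in concepts)


# Pen synonyms taken from the list above.
pen_terms = ["ball point", "ballpoint", "writing instrument", "writing implement"]
# "closure" is a hypothetical extra synonym, added only for illustration.
cap_terms = ["cap", "closure"]

print(build_query(pen_terms, cap_terms))
```

Keeping each concept in its own list makes it easy to drop or restore a synonym and to compare how the result count changes.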
  • Spelling Mistakes (Typo Error or Typographical Ambiguities):
    • Keywords Spelling Errors
      • Patent offices around the world receive patent data from inventors, assignees and attorneys in different countries and enter it into a centralised database; spelling mistakes are likely to be introduced during this process.
      • Sometimes a word contains an extra space (one or two), which makes it unsearchable. This can be a major problem if the key technology term is misspelled throughout the patent, because an ordinary word search is then very likely to miss that relevant patent.
      • Indexing errors also arise from OCR processing of patent applications, because similar-looking characters such as "e" versus "c", "I" versus "l" or "1", "n" versus "ri", "o" versus "0", "q" or "g" versus "9" and "m" versus "rn" are likely to be misinterpreted.
      • With the advent of electronic patent application filing, the number of patent reexamination steps was reduced. As a consequence, the chance of undetected spelling errors increases.
      • Examples:
        • Separate and Seperate
        • Peice and Piece
        • Equiptment and Equipment
        • Spirt and Spirit
        • Dairy and Diary
        • Column and Colum
        • Hygiene and Hygene, Hygine, Hiygeine, Higeine, Hygeine
        • Vacuum and Vaccuum, Vaccum, Vacume
        • Stationery (office supplies) and Stationary (motionless)
        • Gold club and Golf club [On QWERTY keyboards the d and f keys are next to each other, so one is easily typed in place of the other].
    • Assignee Spelling Errors
      • Assignee-based searches also suffer from spelling mistakes; if we do not know the possible misspellings of a name, we will easily miss the relevant patent.
      • Companies often have branches in different countries, where each branch has its own company suffix, e.g., "Limited" (United States), "GmbH" (Germany), or "Kabushiki Kaisha" (Japan). Moreover, the use of punctuation varies among company suffix abbreviations: "L.L.C." in contrast to "LLC", for example.
      • Examples:
        • Whirlpool Corporation and Whirpool Corporation
        • Esso and Exxon
        • Koninklijke Philips and Koninklijke Phillips
        • Koninklijke Philips and Koninkl Philips
        • Nippon Steel Corp and Nipon Steel Corp
        • Hitachi Ltd and Hittachi Ltd
        • Whetherford International and Weatherford International
        • Emulecks Corporation and Emulex Corporation.
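One way to catch assignee variants like those above is to normalise the names and then compare them approximately. The following is a minimal Python sketch under stated assumptions: `normalize` and `similar` are hypothetical helper names, the suffix list is illustrative and far from complete, and the 0.85 threshold is an arbitrary starting point that would need tuning. It uses `difflib.SequenceMatcher` from the Python standard library.

```python
import re
from difflib import SequenceMatcher

# Illustrative suffix list (assumption); a real list would be much longer.
SUFFIXES = {"corporation", "corp", "ltd", "limited", "llc", "gmbh", "inc",
            "kabushiki", "kaisha"}


def normalize(name):
    """Lower-case the name, drop punctuation and strip trailing company suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    tokens = cleaned.split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def similar(name_a, name_b, threshold=0.85):
    """True if the normalised names are at least `threshold` similar."""
    ratio = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
    return ratio >= threshold


print(similar("Whirlpool Corporation", "Whirpool Corporation"))
print(similar("Hitachi Ltd", "Hittachi Ltd"))
print(similar("Esso", "Exxon"))
```

Single-letter slips such as Whirpool or Hittachi score above the threshold, whereas a pair like Esso and Exxon does not. Pairs of that kind are different names rather than misspellings, so they need a manually maintained alias list instead of approximate matching.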
  • Machine Translation:
    • Machine translation is a main problem when searching patents that were filed in domestic languages, such as those of Japan, Korea and China, and that have no INPADOC family members in English. Sometimes important technology terms are not translated. More seriously, some terms are not exact equivalents of each other, and nominally equivalent words have different meanings in English, French and German.
    • Sometimes machine translation is not tuned to the particular technology domain but relies on general translation rules, which significantly affects searching.
  • Nouns & Verbs:
    • In English many words can be both nouns and verbs, and their other meanings are not always obvious. Think about the possible noun and verb meanings of words like oil, driver, dye, ground, train, can, rifle, down, back, store and cook. This may sound like a minor problem, but the different meanings can affect the results.
    • Adding context, such as a classification or other words, usually removes the problem, but not always. Take ship or shipping: both words can refer to ships, or to moving cargo on trains or lorries, where ships are not relevant.
  • Compound Nouns:
    • Patent attorneys write compound nouns as they see fit, which is a major problem because the same compound noun appears in several forms and spellings. Because of these variations, the number of results in search databases varies with the form used.
    • Examples:
      • Wheelchair rather than Wheel chair
      • Heatsink rather than Heat sink or Heat-sink
      • Sodium Phenylbutyrate rather than Sodium PBA or NaPBA
      • Ball point is more popular than Ballpoint
      • Triglycerides rather than Tri-glycerides
      • Semiconductor rather than Semi-conductor
    • Hyphens do not help, as they are treated as spaces by databases and hence mean separate words.
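Because the closed, spaced and hyphenated forms may each be indexed differently, it can help to search all three at once. The following is a minimal Python sketch, assuming a database that accepts quoted phrases and a generic OR operator; `compound_variants` and `variants_clause` are hypothetical names.

```python
def compound_variants(first, second):
    """Return the closed, spaced and hyphenated spellings of a two-part compound."""
    return [f"{first}{second}", f"{first} {second}", f"{first}-{second}"]


def variants_clause(first, second):
    """OR the variants together, quoting any form that contains a space or hyphen."""
    parts = []
    for form in compound_variants(first, second):
        needs_quotes = " " in form or "-" in form
        parts.append(f'"{form}"' if needs_quotes else form)
    return "(" + " OR ".join(parts) + ")"


print(variants_clause("heat", "sink"))
print(variants_clause("wheel", "chair"))
```

In a database that treats hyphens as spaces, the hyphenated form simply duplicates the spaced one, so keeping it in the clause is redundant but harmless.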
  • Words with Different Spellings:
    • Patents are drafted with spellings that vary from country to country, and some variants are known only to those in the same industry.
    • Examples:
      • Aluminium, Alumininium, Alumina and Aluminum
      • Program and Programme
      • Vapour and Vapor
      • Sulfur and Sulphur
      • Color and Colour
      • Ageing and Aging
      • Diarrhoea and Diarrhea
      • Moisturiser and Moisturizer
  • Dialects or Linguistics:
    • This category includes faux amis, or false friends: words that appear to be the same in French and in English but have a different meaning. Dialects also differ in phonology, grammar and vocabulary according to the group of speakers who use them. Both are a problem when building a search query, searching and analysing results in a foreign language.
    • Examples of faux amis:
      • In French, location does not mean a place to visit; it means rental,
      • Grappe means a bunch, not a grape for eating,
      • Confection refers to the making of clothing, not a dessert or sweet, and
      • Coin does not mean money; it means corner.
    • We often rely on context, but a search will sometimes return irrelevant results, because the context is apparent only in the results, not in the search terms.
  • Homophones (Different Spelling with Same Pronunciation):
    • Because some English words are pronounced the same, it is easy to confuse their spellings, such as buy and by, cell and sell, right and write and so on.
    • Examples: words with the same pronunciation that are spelt quite differently
      • Steel and Steal
      • Cereal and Serial
      • Cede and Seed
      • Poor and Pour
      • Hole and Whole
      • Rain and Reign and Rein
    • A real problem lies in the words inflammable and flammable, both of which mean capable of burning, even though the prefix "in" in English often means "not", as in invalid versus valid.
  • Homonyms (Same Word Different Meanings):
    • Sometimes words are spelt the same but have different meanings. While some such meanings are related, often they are not.
    • Examples:
      • Light can mean either low weight or illumination
      • Oil which can be petroleum or be put on food
      • Plane which can be an aircraft or a horizontal surface
      • Cycle which can be a bicycle or a stage in a process
      • Gas is used for petroleum (gaseous fuel) or vapour
      • Oil well obviously contains oil, but a well with no context is probably a water well
      • Christmas tree means oil pumping equipment rather than Christmas festival (25th December).
  • Contronyms (Words with Opposite Meanings):
    • Some words have two opposite meanings.
    • Examples:
      • Clip means to fasten or detach
      • Cleave means to adhere (join together) or to separate (divide)
      • Fast means quick or stuck or made stable
      • Quantum means significantly large or a minuscule part
      • Scan means to peruse or to glance
      • Skinned means covered with skin or with the skin removed.
  • Different Words:
    • The same thing is often known by distinctly different words, such as hood and bonnet, trunk and boot, parking lot and car park, rubber and eraser, cell phone and mobile phone, gasoline (or gas) and petrol, elevator and lift, sidewalk and pavement. Problems arise when the searcher is not aware that a distinctly different word exists.
  • Syntax:
    • The arrangement of words in a patent is a minor problem, as few people search exclusively for long phrases. Searching for three-word phrases is generally sufficient. Adjacency operators that look for words within five words of each other are more than sufficient in English to cover all word orders. Sometimes you need to decide which order is most suitable for capturing the important technology results.
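To illustrate what an adjacency operator does, the following minimal Python sketch checks whether two words occur within a given number of words of each other, in either order. It is a plain-text simulation only: the function name `within` is hypothetical, and the counting rule (difference in word positions of at most `max_gap`) is an assumption, since each search database defines its own operators and counting rules.

```python
import re


def within(text, word_a, word_b, max_gap=5):
    """True if word_a and word_b occur within max_gap words of each other, in any order."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    positions_a = [i for i, token in enumerate(tokens) if token == word_a.lower()]
    positions_b = [i for i, token in enumerate(tokens) if token == word_b.lower()]
    # Compare every occurrence of one word with every occurrence of the other.
    return any(abs(i - j) <= max_gap for i in positions_a for j in positions_b)


abstract = ("A cap for a ball point pen includes a truncated end "
            "which is pierced by an aperture.")

print(within(abstract, "pen", "cap"))       # word order does not matter
print(within(abstract, "cap", "aperture"))  # too far apart for a gap of five
```

Because the check ignores word order, a single proximity condition covers both "cap ... pen" and "pen ... cap" without listing each phrase separately.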

"IntellectPeritus team undergoes with rigorous search training and we have unique search model which overcomes challenges of patent searching and effective search strategies leading us towards completion of 100 % patent searching. Our search model is combination of keywords based searching, patent classification searching, citations searching and assignee/inventors searching. In given limited time period, IntellectPeritus search model provides results claiming towards 100 % completion of search."