Terms related to documentation centres:

  • Authenticity. Assurance that certain digital materials are genuine and trustworthy, that is, that they are what they are claimed to be, whether as an original object or as a faithful, reliable copy of an original produced through fully documented processes.
  • Digital preservation. Actions intended to keep digital objects accessible over the long term.
  • Certification. The process of assessing the degree to which a preservation programme complies with a previously agreed set of minimum standards or practices.
  • Rights. Legal entitlements or powers held or exercised over digital materials, such as copyright, privacy, confidentiality, and national or corporate restrictions imposed for security reasons.
  • Verification. The act of checking whether a digital object, in a given file format, is complete and complies with the format specification.
  • Identity of digital objects. The characteristic that distinguishes a digital object from all others, including other versions or copies of the same content.
  • Ingest. The operation of storing digital objects, and their related documentation, in a secure and orderly manner.
  • Integrity of digital objects. The state of objects that are complete and have not undergone any unauthorised or undocumented corruption or alteration.
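
One common way to check the integrity described in the last term is to record a checksum when an object is ingested and recompute it later. The following is a minimal sketch under that assumption, not a method prescribed by the definitions above; the function names `checksum` and `is_intact` and the sample bytes are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of a digital object's bytes as hex."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """True if the object still matches the digest recorded at ingest."""
    return checksum(data) == recorded_digest


# Hypothetical object stored at ingest, together with its digest.
original = b"digital object contents"
recorded = checksum(original)

print(is_intact(original, recorded))                 # unchanged copy
print(is_intact(original + b" (edited)", recorded))  # altered copy
```

A matching digest only shows that the bytes are unchanged; whether a change was authorised or documented still has to be tracked separately.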

When discussing digital documentation, three essential properties must be kept in mind:

  • Computability: the information can be processed or “computed” by a computer.
  • Virtuality: digital information is not subject to the limitations inherent in analogue information.
  • Capacity: there are no practical limits on the volume of information that can be accessed online through unified interfaces.

However, traditional or analogue documentation does not share these characteristics. First, it has no computability, since no device is needed to read it. Likewise, it lacks virtuality because of its limitations: it offers only text and still images, but no sound or moving images. Finally, unlike digital documentation, it does have practical limits on the volume of information that can be accessed.

Given the above, we could conclude that digital documentation has certain advantages over traditional documentation:

  • It lets publishers adapt to the habits and tastes of younger readers.
  • It fosters a very direct relationship with customers.
  • It offers multimedia information (text, sound and image).
  • Interactivity: it provides a relationship between the reader and the system.
  • Information can be retrieved.
  • The amount of information per unit of volume is vastly greater.
  • It provides access to the titles.
  • Virtuality: it is easy to reproduce, transmit and store.

Even so, digital documentation is not all advantages:

  • A considerable number of reading devices exist, but they are not yet widespread.
  • Difficulties in distribution and retail sales.
  • Little information about the contents.
  • Prices that are not especially competitive compared with analogue publishing.
  • Technical problems related to pre-installation.
  • Mediation (a computer is required).
  • Poor ergonomics.

Finally, in addition to all of the above, two of the most notable advantages of traditional publishing are worth noting: comfort and practicality, since a printed work can always be carried and read anywhere.

Summary compiled from the following information sources:

itList, launched in April 1996, was the first service built around the concept of shared online bookmarks. Within the next three years, online bookmark services became competitive, with venture-backed companies such as Backflip, Blink, Clip2, ClickMarks, HotLinks and others entering the market. However, it was Delicious, then called del.icio.us, that coined the term social bookmarking. A year later, as Delicious began to take off, other bookmark services were released, CiteULike among them.

According to Wikipedia, social bookmarking is “a method for internet users to share, organise, search and manage bookmarks of web resources”. Descriptions are attached to these bookmarks in the form of metadata, so that other users can understand the content of the resource. These descriptions may be free-text comments, votes for or against its quality, or tags that together become a folksonomy, also known as social tagging; that is, the process by which many users add metadata in the form of keywords to shared content.

Bookmark services save links to web pages that users want to remember or share. These online services can be either public or private, in the sense that bookmarks can be shared only with specified people or groups. People who are allowed to see the bookmarks can view them chronologically, by category or tag, or via a search engine.
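
The tagging and viewing options described above can be illustrated with a small sketch. It is only one possible data model, assumed for illustration and not taken from any particular service: the `Bookmark` class, the helper functions and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class Bookmark:
    user: str
    url: str
    tags: Tuple[str, ...]
    saved_on: date


def by_tag(bookmarks):
    """Group bookmark URLs under every tag any user attached to them."""
    index = defaultdict(set)
    for b in bookmarks:
        for tag in b.tags:
            index[tag].add(b.url)
    return index


def chronological(bookmarks):
    """Return bookmarks ordered from oldest to newest."""
    return sorted(bookmarks, key=lambda b: b.saved_on)


# Hypothetical sample data, invented for illustration only.
bookmarks = [
    Bookmark("ana", "https://example.org/mt", ("translation", "nlp"), date(2008, 3, 2)),
    Bookmark("ben", "https://example.org/corpus", ("nlp",), date(2008, 1, 15)),
    Bookmark("ben", "https://example.org/mt", ("mt",), date(2008, 2, 1)),
]

folksonomy = by_tag(bookmarks)
print(sorted(folksonomy["nlp"]))
print([b.user for b in chronological(bookmarks)])
```

Because the tag index is built from whatever keywords users happen to choose, the same page can end up under several tags, which is what makes the result a folksonomy rather than a fixed classification.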

Nowadays these services are developing rapidly and have added extra features such as ratings and comments on bookmarks, the ability to import and export bookmarks from browsers, emailing of bookmarks, web annotation, and groups or other social network features.

 

 

These are the sources that have been used for this article:

This article looks at the difficulties Translendium has when translating from French into Spanish.

First of all, the translator makes some mistakes with the meaning of words. For example, in the sentence comme pour beaucoup d’histoires, the preposition “pour” is translated as “por” or “para”, giving: Como por|para muchas historias. In this case, the correct translation would be “para”. The translator does not seem to know the exact meaning of “pour”: here it means “para”, whereas “por” corresponds to “par” in French.

Furthermore, the word order is sometimes incorrect. In the sentence según las regiones es donde aplicada, the order of the words has clearly been altered. The correct sentence would be según las regiones donde es aplicada.

The translator also makes mistakes with grammatical number. For instance: existe varias versiones por el mundo. If the subject of the sentence is plural, the verb must be plural too. Thus, the correct sentence would be existen varias versiones por el mundo.

Apart from these mistakes, the translator sometimes chooses the wrong word for the Spanish translation. This is the case with “entourage”, for which the translator chooses “círculo”. The best word, however, would be “entorno”.

Finally, in general the translator produces a mechanical translation, for example when it translates une histoire présente partout dans le monde as una historia presenta por todas partes en el mundo.

 

Sources:

In this article, we look at the problems Translendium has when translating from English into Spanish.

The first error to point out is that the translator produces a mechanical translation; that is, it translates words one by one. As an example, the title Three little pigs is translated as Tres pequeños cerdos, although the established Spanish title of this fairy tale is Los tres cerditos.
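
The word-by-word behaviour described above can be illustrated with a minimal sketch. It assumes a tiny hypothetical dictionary (`LEXICON`) and says nothing about how Translendium actually works internally; it only shows why substituting each word independently yields Tres pequeños cerdos rather than the established title.

```python
# Hypothetical one-to-one dictionary; a real system is far more complex.
LEXICON = {"three": "tres", "little": "pequeños", "pigs": "cerdos"}


def word_by_word(sentence: str) -> str:
    """Replace each word independently, leaving unknown words unchanged."""
    words = sentence.lower().split()
    return " ".join(LEXICON.get(w, w) for w in words)


print(word_by_word("Three little pigs"))  # tres pequeños cerdos
```

Nothing in this approach can add the article Los or turn pequeños cerdos into the diminutive cerditos, because no step ever looks at the phrase as a whole.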

The translator also has some difficulties with verbs. For instance, but the story is thought to be much older is translated as pero se cree que la historia es|está mucho más vieja. It does not distinguish between es and está, and the correct form in this case is es. Another problem, this time with tenses, is that the translator does not distinguish between the pretérito perfecto simple and the pretérito imperfecto. For example, And the little pig answered is translated as y el pequeño cerdo contestaba. The correct form would be y el pequeño cerdo contestó, using the pretérito perfecto simple.

Moreover, there are some mistakes with grammatical gender. For example, in the following sentence: El primer pequeño cerdo construye una casa de paja, pero lo derriba el lobo. Since casa takes the feminine indefinite article (una), the direct object pronoun cannot be masculine (lo); it should be la.

Furthermore, the translator makes some mistakes with the meaning of words. In the sentence little pigs, little pigs, it does not consider the different meanings of little, which can mean “poco” or “pequeño”. Here the translator produces Poco cerdo, poco cerdo, whereas the correct translation would be cerdito, cerdito.

Finally, the translator is unable to translate some words into Spanish. For example, the word “huff” is left untranslated.

 

Sources:

 

 

In this article we define and discuss the following terms: machine translation, machine-aided translation, multilingual content management and translation technology.

  • Machine Translation, sometimes referred to by the abbreviation MT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.
  • Translation technology is the technology in which the main action is the interpretation of the meaning of a text, and subsequent production of an equivalent text, also called a translation, that communicates the same message in another language. The text to be translated is called the source text, and the language it is to be translated into is called the target language; the final product is sometimes called the “target text”. 
  • Computer-aided translation, computer-assisted translation, or CAT is a form of translation wherein a human translator translates texts using computer software designed to support and facilitate the translation process.
  • Multilingual content management handles information that is mostly in the form of more or less structured text documents, but may also include audio clips, video clips and images.

 

Information sources:

According to the FEMTI report, these are the main characteristics of a translation task:

  • ASSIMILATION: The ultimate purpose of the assimilation task (of which translation forms a part) is to monitor a large volume of texts produced by people outside the organization, in several languages.
  • DISSEMINATION: The most important purpose of Dissemination is to deliver to others a translation of documents produced inside the organization.
  • COMMUNICATION: The principal purpose of the communication task is to support multi-turn dialogues between people who speak different languages. The translation quality must be high enough for painless conversation, despite possible syntactically ill-formed input and idiosyncratic word and format usage.

 

Information sources:

The main idea of this article is to explain some of the topics that the different research centres have developed.

First of all, we could talk about Corpus Linguistics. Corpus linguistics is the study of languages as expressed in samples (corpora) or “real world” text. This method represents a digestive approach to deriving a set of abstract rules by which a natural language is governed or else relates to another language.

Another topic that I consider important is speech synthesis. Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database.
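
The concatenation idea in the last sentence can be sketched in a few lines. This is a deliberately simplified illustration, not a real synthesizer: the `UNITS` table stands in for the database of recordings, and its names and sample values are hypothetical placeholders.

```python
# Hypothetical "database" of recorded units: each maps to audio samples.
# The numbers are placeholders, not real recordings.
UNITS = {
    "hel": [0.1, 0.3, 0.2],
    "lo": [0.4, 0.1],
    "_": [0.0, 0.0],  # short silence between words
}


def synthesize(unit_names):
    """Concatenate the stored samples for each requested unit, in order."""
    samples = []
    for name in unit_names:
        samples.extend(UNITS[name])  # KeyError if a unit was never recorded
    return samples


print(synthesize(["hel", "lo", "_"]))
```

Real systems also have to choose which recorded pieces to use and smooth the joins between them; the sketch covers only the concatenation step.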

The third topic I want to focus on is Semantics. Semantics is the study of meaning in communication. In linguistics, it is the study of interpretation of signs as used by agents or communities within particular circumstances and contexts.

Finally, I would like to talk about Machine Translation. Machine translation, sometimes referred to by the abbreviation MT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition and translation of idioms, as well as the isolation of anomalies.

 

Information sources:

To continue with our work on HLT, it is important to focus on the most recent research topics mentioned in major sites on Human Language Technologies.

First of all, we could point out the German Research Centre for Artificial Intelligence, which works on these themes in research, development and commercial projects:

  • Exploiting – and automatically extending – ontologies for content processing.
  • Tighter integration of shallow and deep techniques in processing.
  • Enriching deep processing with statistical methods.
  • Combining language checking with structuring tools in document authoring.
  • Document indexing for German and English.
  • Automatically associating recognized information with related information and thus building up collective knowledge.
  • Automatically structuring and visualizing extracted information.
  • Processing information encoded in multiple languages, among them Chinese and Japanese.

Likewise, the Austrian Research Institute for Artificial Intelligence develops linguistic resources and processes as well as application prototypes:

  • Typed unification-based grammar formalisms.
  • Development of an HPSG-based grammar for German.
  • Natural Language Generation.
  • Speech Synthesis.
  • Computational Morphology.
  • Natural language interfaces and advisory systems.
  • Concept-to-speech systems.

And, to finish with some of the most recent research topics on HLT, we could mention the National Centre for Language Technology. This research centre focuses its attention on these areas:

  • Computer Assisted Language Learning (CALL): Integrating CL/NLP/HLT Technology into CALL, CALL for Endangered Languages, CALL for Primary School Environments, CALL for Remedial Learners.
  • Corpus Linguistics: Collocation, Contrastive Computational Linguistics, Corpus-based Translation Studies.
  • Machine Translation and Translation Technology: Statistical and Rule-Based MT (SMT, RBMT), Example-Based MT (EBMT), Translation Memories (TMs), Boosting Existing MT Systems, Machine-Aided Translation (MAT), Computer-Aided Translation (CAT), Controlled Languages.
  • Treebank-Based Unification Grammar Acquisition: Automatic Feature-Structure Annotation Algorithms, Subcategorisation Frame Extraction, Wide-Coverage Robust Probabilistic Unification Grammar Acquisition, PCFG-Based LFG Approximation, HPSG Acquisition, Multilingual Treebank-Based Grammar Acquisition.
  • Semantics: Discourse Representation Theory, Linear-Logic Based Semantics, Computation of Logical Forms from Treebanks, Open-Domain Question Answering Systems.
  • Speech Technology: Speaker Characterisation, Audio Classification, Retrieval and Coding, Human Computer Interfaces (HCIs).
  • Multilingual Information Retrieval/Extraction.
  • Language Evolution.

Information sources:

Nowadays, we can find several definitions of Human Language Technologies.

According to Wikipedia, Language Technology is often called Human Language Technology (HLT) or Human Language processing (HLP). It has computational linguistics and speech technology at its core, but also includes many of their application-oriented aspects.

Moreover, in the words of Hans Uszkoreit:

“Language Technology comprises computational methods, computer programs and electronic devices that are specialized for analyzing, producing or modifying texts and speech. These systems must be based on some knowledge of human language. Therefore language technology defines the engineering branch of computational linguistics”.

Finally, to finish with this brief introduction to Human Language Technology, we could point out the short definition offered by the Meraka Institute:

“Human Language Technology (HLT) makes it easier for people to interact with machines. This can benefit a wide range of people – from illiterate farmers in remote villages who want to obtain relevant medical information over a cellphone, to scientists in state-of-the-art laboratories who want to focus on problem-solving with computers.”

Information sources:
