Full metadata record
DC Field | Value | Language
dc.provenance | Comisión de Investigaciones Científicas | -
dc.contributor | De Giusti, Marisa Raquel | -
dc.contributor | Lira, Ariel Jorge | -
dc.contributor | Oviedo, Néstor | -
dc.creator | De Giusti, Marisa Raquel | -
dc.creator | Lira, Ariel Jorge | -
dc.creator | Oviedo, Néstor | -
dc.date | 2011-05-17 | -
dc.date.accessioned | 2019-04-29T16:03:59Z | -
dc.date.available | 2019-04-29T16:03:59Z | -
dc.date.issued | 2011-05-17 | -
dc.identifier | http://digital.cic.gba.gob.ar/handle/11746/2223 | -
dc.identifier | External link | -
dc.identifier.uri | http://rodna.bn.gov.ar:8080/jspui/handle/bnmm/308284 | -
dc.description | Digital repositories acting as resource aggregators typically face different challenges, roughly classified into three main categories: extraction, improvement and storage. The first category comprises issues related to dealing with different resource collection protocols (OAI-PMH, web crawling, web services, etc.) and their representations (XML, HTML, database tuples, unstructured documents, etc.). The second category comprises information improvements based on controlled vocabularies, specific date formats, correction of malformed data, etc. Finally, the third category deals with the destination of downloaded resources: unification into a common database, sorting by certain criteria, etc. This paper proposes an ETL architecture for designing a software application that provides a comprehensive solution to the challenges posed by a digital repository acting as a resource aggregator. Design and implementation aspects considered during the development of this tool are described, focusing especially on architecture highlights. | -
dc.format | application/pdf | -
dc.format | application/pdf | -
dc.language | eng | -
dc.rights | info:eu-repo/semantics/openAccess | -
dc.rights | Attribution 4.0 International (BY 4.0) | -
dc.source | reponame:CIC Digital (CICBA) | -
dc.source | instname:Comisión de Investigaciones Científicas de la Provincia de Buenos Aires | -
dc.source | instacron:CICBA | -
dc.source.uri | http://digital.cic.gba.gob.ar/handle/11746/2223 | -
dc.source.uri | External link | -
dc.subject | Computer and Information Sciences | -
dc.title | Extract, transform and load architecture for metadata collection | -
dc.type | info:eu-repo/semantics/conferenceObject | -
dc.type | info:eu-repo/semantics/publishedVersion | -
dc.type | info:ar-repo/semantics/documentoDeConferencia | -
Appears in collections: Comisión de Investigaciones Científicas de la Prov. de Buenos Aires

Files in this item:
There are no files associated with this item.
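
Illustrative sketch (not part of the record): the dc.description field names three categories of work for an aggregator: extraction, improvement and storage. The Python below is a minimal sketch of how those three stages might be chained. It is not the architecture from the paper. The sample XML, the `LANGUAGE_VOCAB` mapping, the `resources` table, and every function name are hypothetical, chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import datetime

DC = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core element namespace

# Hypothetical sample standing in for one harvested record.
SAMPLE_XML = """
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title>  Extract, transform and load architecture for metadata collection </dc:title>
    <dc:date>17/05/2011</dc:date>
    <dc:language>English</dc:language>
  </record>
</records>
"""

# Hypothetical controlled vocabulary: free-text language names to codes.
LANGUAGE_VOCAB = {"english": "eng", "spanish": "spa"}


def extract(xml_text):
    """Extraction: turn one source representation (XML) into plain dicts."""
    root = ET.fromstring(xml_text)
    for record in root.iter("record"):
        yield {
            "title": record.findtext(DC + "title", default=""),
            "date": record.findtext(DC + "date", default=""),
            "language": record.findtext(DC + "language", default=""),
        }


def normalize_date(value):
    """Try a few assumed input formats and return an ISO 8601 date, or None."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # malformed dates are left unset rather than guessed


def transform(record):
    """Improvement: clean whitespace, normalize dates, apply the vocabulary."""
    return {
        "title": " ".join(record["title"].split()),
        "date": normalize_date(record["date"]),
        "language": LANGUAGE_VOCAB.get(record["language"].strip().lower()),
    }


def load(records, connection):
    """Storage: unify the improved records into one common table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS resources (title TEXT, date TEXT, language TEXT)"
    )
    connection.executemany(
        "INSERT INTO resources VALUES (:title, :date, :language)", records
    )
    connection.commit()


def run_pipeline(xml_text, connection):
    """Chain the three stages: extract, transform, load."""
    load([transform(r) for r in extract(xml_text)], connection)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_pipeline(SAMPLE_XML, conn)
    print(conn.execute("SELECT title, date, language FROM resources").fetchall())
```

Keeping each stage as a separate function means a second source (for example an HTML scraper) would only need its own `extract`, while `transform` and `load` stay shared. That is one plausible reading of the separation the abstract describes.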