Data Warehousing and Data Mining - Literature Subscription - Zhejiang Wanli University Library

Literature Subscription

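Research on the Establishment and Application of a Power Grid Cost Data Warehouse under Big Data Management (基于大数据管理下电网造价数据仓库的建立及应用研究)

Wei Dongni (韦冬妮), Liu Shangke (刘尚科)

Economic and Technological Research Institute, State Grid Ningxia Electric Power Company

Source: VIP Journal Database; Tongfang Journal Database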

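Keywords: big data; power grid engineering; data warehouse; application research

Abstract: Based on collections of structured and semi-structured cost data, such as budget estimates and final settlements, from power grid infrastructure projects, the paper maps the relationships between technical-economic indicators and project costs for substation projects and transmission line projects. Drawing on big data theory and the actual needs of grid and provincial companies in carrying out technical-economic work, it builds a data warehousing architecture for power grid project costs. It also studies how the data warehouse applies to technical-economic tasks such as review support, cost analysis, and online analysis, and explores new approaches to cost analysis, data control, and management for power grid projects under big data warehousing.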

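Construction of a QAR Data Warehouse in Hive (QAR数据仓库在Hive中的构建)

Feng Xingjie (冯兴杰), Wu Xiyu (吴稀钰), Zhao Jie (赵杰), He Yang (贺阳), Fang Shu (房戍)

College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300

Source: VIP Journal Database; China Academic Journal Database; Tongfang Journal Database; Wanfang Journal Database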

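Keywords: Hive; quick access recorder (QAR); data warehouse; data processing; Hadoop

Abstract: Analyzing QAR data is a highly effective way to monitor aircraft status. However, with the rapid growth of civil aviation, the volume of QAR data has increased sharply, and existing QAR data warehouses built on relational databases cannot support storage and analysis at this scale, so large amounts of QAR data go unprocessed and become information waste. To address these shortcomings, a Hive-based QAR data warehouse is proposed. Based on an analysis of Hive's characteristics and the structure of QAR data, the overall architecture and storage structure of the Hive-based QAR data warehouse are designed. Compatibility with the existing data warehouse is achieved by migrating its data into the Hive-based warehouse. Experimental results show that, even as QAR data grows sharply, the processing time required by the Hive-based warehouse continues to grow linearly.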

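Research on Change Data Capture Technology in an AFC Dynamic Data Warehouse Application System (AFC动态数据仓库应用系统中的变化数据捕获技术研究)

Li Yuqing (李玉卿), Cheng Xiaobin (承晓斌), Zhang Jian (张见), Zhang Ning (张宁)

Beijing National Railway Research & Design Institute of Signal & Communication Group Co., Ltd., Beijing 100070; School of Automation, Southeast University, Nanjing 210018; Rail Transit Research Institute, Intelligent Transportation System Research Center, Southeast University, Nanjing 210018

Source: VIP Journal Database; Wanfang Journal Database; China Academic Journal Database; Tongfang Journal Database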

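Keywords: AFC system; data warehouse; log-based method

Abstract: Data timeliness is critical to effective tactical decision-making, and the quality of the data capture technique determines how timely the data is. The paper briefly introduces data warehouses, dynamic data warehouses, and data capture techniques. On this basis, it analyzes log-based change data capture in detail and discusses the issues involved, including log reading, log parsing, and data scheduling strategies, laying the groundwork for building the change data capture module of an AFC dynamic data warehouse application system.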

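Unconstrained Demand Estimation Based on a Big Data Warehouse in Revenue Management: Framework and Challenges (收益管理中基于大数据仓库的需求无约束估计:框架与挑战)

Guo Peng (郭鹏)

School of Economics and Management, Guiyang University, Guiyang, Guizhou 550005

Source: VIP Journal Database; Wanfang Journal Database; China Academic Journal Database; Tongfang Journal Database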

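Keywords: revenue management; demand forecasting; unconstrained estimation; data warehouse; big data; business intelligence; sentiment analysis

Abstract: Existing methods for unconstrained demand estimation were all developed from demand information obtained from a company's internal data warehouse. In today's fiercely competitive, big-data-driven market, they cannot meet the growing needs of revenue management systems for real-time demand forecasting and optimization decision analysis. To acquire and analyze, dynamically and in real time, unconstrained demand data on each customer from both internal and external data sources, including structured and unstructured information, the paper proposes a big data warehouse framework whose subject is unconstrained demand estimation for revenue management. On that basis, it discusses the challenges faced in mining unconstrained demand knowledge and in developing and applying business intelligence analysis tools for unconstrained demand estimation.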

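A Brief Analysis of Knowledge Mining Technology for Digital Archival Information Resources (浅析数字档案信息资源的知识挖掘技术)

Zhao Shuyuan (赵淑媛)

Shenyang Fifth People's Hospital

Source: VIP Journal Database; Tongfang Journal Database; Wanfang Journal Database; China Academic Journal Database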

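Keywords: knowledge mining; digital archival information; identification in data sets; nontrivial process; knowledge organization; archival knowledge; urban construction archives; data warehouse; municipal urban construction archives; archive utilization

Abstract: I. Purpose and objects of knowledge mining technology for digital archival information resources. Knowledge mining is an important knowledge organization technology, a basic tool of subjective knowledge organization, and an advanced form and development direction of information organization. Definitions of knowledge mining vary widely; the more authoritative current explanation, given by Usama *** et al., is: "Knowledge mining is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data sets." [1] 1. Purpose of archival knowledge mining technology.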

The Georges Pompidou University Hospital Clinical Data Warehouse: A 8-years follow-up experience

Jannot, Anne-Sophie; Zapletal, Eric; Avillach, Paul; Mamzer, Marie-France; Burgun, Anita; Degoulet, Patrice

Paris Descartes Fac Med Paris France; INSERM UMR Informat Sci Support Personalized Med E22 113 Paris France; Georges Pompidou Univ Hosp Med Informat Biostat & Publ Hlth Dept 20 Rue Leblanc F-75015 Paris France; INSERM EA 4569 Med Eth Dept Paris France; Harvard Med Sch Dept Biomed Informat Boston MA USA

Source: EBSCO (ASP/aph); ScienceDirect Journals

Keywords: Electronic health record; Clinical datawarehouse; Institutional ethics committee; Clinical research; Clinical epidemiology; Health services research; ELECTRONIC HEALTH RECORD TRANSLATIONAL RESEARCH SECONDARY USE INFORMATICS INFRASTRUCTURE

Abstract: Background: When developed jointly with clinical information systems, clinical data warehouses (CDWs) facilitate the reuse of healthcare data and leverage clinical research. Objective: To describe both data access and use for clinical research, epidemiology and health service research of the "Hopital Europeen Georges Pompidou" (HEGP) CDW. Methods: The CDW has been developed since 2008 using an i2b2 platform. It was made available to health professionals and researchers in October 2010. Procedures to access data have been implemented and different access levels have been distinguished according to the nature of queries. Results: As of July 2016, the CDW contained the consolidated data of over 860,000 patients followed since the opening of the HEGP hospital in July 2000. These data correspond to more than 122 million clinical item values, 124 million biological item values, and 3.7 million free text reports. The ethics committee of the hospital evaluates all CDW projects that generate secondary data marts. Characteristics of the 74 research projects validated between January 2011 and December 2015 are described. Conclusion: The use of HEGP CDWs is a key facilitator for clinical research studies. It required however important methodological and organizational support efforts from a biomedical informatics department. (C) 2017 Elsevier B.V. All rights reserved.

Design Life-Cycle-Driven Approach for Data Warehouse Systems Configurability

Khouri, Selma; Bellatreche, Ladjel

Natl Comp Sci Engn Sch ESI Algiers Algeria; Poitiers Univ LIAS ISAE ENSMA F-86960 Poitiers France

Source: SpringerLink Journals

Keywords: Data warehouse system; Configurability; Design life-cycle; CVL; CONCEPTUAL-MODEL; ETL

Abstract: Many modern software systems are designed to be highly configurable. Configurability is the ability to build consistent systems from a common architecture through selecting and synthesizing provided design elements. Configurability offers high customizability and efficient reuse strategy. Configurability has not enjoyed the same popularity in data warehouse (DW) design comparing to other types of software. Nowadays, we are assisting to an explosion of new DW applications due to high-performance computing and emerging hardware. This continuous evolution context reveals a high degree of variability that needs to be managed and exploited. We propose in this paper a configurability-aware approach for DW design, which allows designers to specify requirements defining suitable design options to generate a customized DW. To satisfy this objective, we need to perform the following three tasks: (i) a deep understanding of the DW design life-cycle analyzed by reviewing its evolutions, (ii) a formalization of each design phase and (iii) an identification of the interactions between phases. This analysis contributes in defining our approach containing: the configuration model which tailors the DW system to meet designers' requirements and the configuration process which produces the corresponding DW configuration. The approach is defined using the Base, Variability, Resolution (BVR) models defined using the Common Variability Language proposed by the Object Management Group for defining variability modeling and implemented using BVR Tool. A case study providing two DW configurations is proposed to show the effectiveness of our approach.

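A Study on Building a Central Information Platform for Air Traffic Control Equipment Operation and Maintenance (空管设备运维信息化中心平台建设探究)

Lu Xuyin (卢栩茵)

Technical Support Center, CAAC Central and Southern Regional Air Traffic Management Bureau, Guangzhou, Guangdong 510400

Source: VIP Journal Database; Wanfang Journal Database; Tongfang Journal Database; China Academic Journal Database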

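Keywords: equipment operation and maintenance; informatization; central platform; data warehouse

Abstract: With the development of civil aviation and the continued deepening of information system construction, the informatized operation and maintenance platform for air traffic control equipment plays an increasingly significant role as the business support for the communication, navigation, and surveillance system. Drawing on the construction of such a platform in the central and southern region, this paper explores how to build a central platform based on multiple information systems.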

Geminivirus data warehouse: a database enriched with machine learning approaches

Silva, Jose Cleydson F.; Carvalho, Thales F. M.; Basso, Marcos F.; Deguchi, Michihito; Pereira, Welison A.; Sobrinho, Roberto R.; Vidigal, Pedro M. P.; Brustolini, Otavio J. B.; Silva, Fabyano F.; Dal-Bianco, Maximiller; Fontes, Renildes L. F.; Santos, Anesia A.; Zerbini, Francisco Murilo; Cerqueira, Fabio R.; Fontes, Elizabeth P. B.

Univ Fed Vicosa Dept Informat Vicosa MG Brazil; Univ Fed Vicosa Natl Inst Sci & Technol Plant Pest Interact BIOAG Vicosa MG Brazil; Univ Fed Fluminense Dept Engn Prod Petropolis RJ Brazil; Univ Fed Vicosa Dept Biol Geral Vicosa MG Brazil; Univ Fed Vicosa Dept Fitopatol Vicosa MG Brazil; Univ Fed Vicosa Dept Bioquim & Biol Mol Vicosa MG Brazil; Univ Fed Vicosa Nucleo Biomol Vicosa MG Brazil; Univ Fed Vicosa Dept Zootecnia Vicosa MG Brazil; Univ Fed Vicosa Dept Solos Vicosa MG Brazil

Source: SpringerLink Journals; EBSCO (ASP/aph); BioMed Central Journals

Keywords: Machine learning; Random Forest; Knowledge discovery; Data mining; Data Warehouse; Geminivirus; CURLY TOP VIRUS NUCLEOTIDE-SEQUENCE IRAN VIRUS CLASSIFICATION RECOMBINANT SATELLITES DIVERSITY ALIGNMENT SUGGESTS

Abstract: Background: The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics. Results: Here, we describe the development of a data warehouse enriched with ML approaches, designated geminivirus.org. We implemented search modules, bioinformatics tools, and ML methods to retrieve high precision information, demarcate species, and create classifiers for genera and open reading frames (ORFs) of geminivirus genomes. Conclusions: The use of data mining techniques such as ETL (Extract, Transform, Load) to feed our database, as well as algorithms based on machine learning for knowledge extraction, allowed us to obtain a database with quality data and suitable tools for bioinformatics analysis. The Geminivirus Data Warehouse (geminivirus.org) offers a simple and user-friendly environment for information retrieval and knowledge discovery related to geminiviruses.

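Research on the Query Contention Problem in an AFC Dynamic Data Warehouse Application System (AFC动态数据仓库应用系统中的查询竞争问题研究)

Li Yuqing (李玉卿), Wu Fan (吴帆), Zhang Jian (张见), Zhang Ning (张宁)

Beijing National Railway Research & Design Institute of Signal & Communication Group Co., Ltd., Beijing 100070; School of Automation, Southeast University, Nanjing 210018; Rail Transit Research Institute, Intelligent Transportation System Research Center, Southeast University, Nanjing 210018

Source: VIP Journal Database; Bokan Journals (Public Cumulative Edition); Tongfang Journal Database; Wanfang Journal Database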

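Keywords: AFC system; dynamic data warehouse; query contention problem; dynamic multi-level cache

Abstract: The paper briefly introduces the basic technical characteristics of traditional data warehouses and the concept of the dynamic data warehouse, and compares the two. It analyzes and discusses existing solutions to query contention in dynamic data warehouses and, in light of the specific application requirements, adopts a method based on dynamic multi-level caching to alleviate query contention in an AFC dynamic data warehouse application system. It also discusses problems that arise when implementing multi-level real-time data caching, such as short-term overload of a particular cache level, and gives corresponding solutions.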
