DataMining.DataMining History
Added line 2:
- Big Data: SSD’s, R, and Linked Data Streams
Changed lines 1-3 from:
to:
Featured links:
- Data Mining and Warehousing blog
Changed line 47 from:
- Data Mining with R: learning by case studies (by Luis Torgo), seems excellent!
to:
- Data Mining with R: learning by case studies (by Luis Torgo), nice tutorial with SQL interaction
Added lines 45-55:
Other links for the class:
- Data Mining with R: learning by case studies (by Luis Torgo), seems excellent!
- Statistics with R, a very interesting but unpolished course on R
- Introduction to R and Exploratory data analysis
- Using R for Introductory Statistics (pre-draft of a published book)
- Using R for data analysis and graphics (JH Maindonald)
- from Introduction to R (a collection of links by Karl W Broman)
- Learning classifiers from distributed, semantically heterogeneous, autonomous data sources
Deleted lines 81-91:
Other links for the class:
- Learning classifiers from distributed, semantically heterogeneous, autonomous data sources
- Statistics with R, a very interesting course on R
- Exploratory data analysis
- Extra statistical functions for Octave, among them boxplot.
- Introduction to R and Exploratory data analysis
- Using R for Introductory Statistics (pre-draft of a published book)
- Using R for data analysis and graphics (JH Maindonald)
- from Introduction to R (a collection of links by Karl W Broman)
Added line 30:
- Data mining on time series: an illustration using fast-food restaurant franchise data
Deleted line 30:
- Mondrian is an OLAP server written in Java. It enables you to interactively analyze very large datasets stored in SQL databases without writing SQL.
Changed lines 32-33 from:
to:
- Tethering Cultural Data with RDF (Kate Byrne)
- RelEx is an English-language semantic relationship extractor, built on the Carnegie-Mellon link parser
- Databases or database “interfaces”:
- Sesame is an open source framework for storage, inferencing and querying of RDF data.
- Mondrian is an OLAP server written in Java. It enables you to interactively analyze very large datasets stored in SQL databases without writing SQL.
- the Exist XML database
- the RDF/SPARQL/XML part of the OpenLink Virtuoso system
- the Jena/ARQ combo
- HyperGraphDB
- RDF Gateway (commercial)
- 4Suite: an open-source platform for XML and RDF processing
- Rx4RDF
Changed lines 32-33 from:
to:
- Querying RDF Data from a Graph Database Perspective (Renzo Angles and Claudio Gutierrez), Querying from a Graph Database Perspective: the case of RDF (presentation)
Added lines 25-27:
- Statistics with R: Factorial methods: Around Principal Component Analysis (PCA)
- PCA by Projection Pursuit: The Package pcaPP (Heinrich Fritz)
- Dimensional Reduction for Data Mapping (Jonathan Edwards and Paul Oman) R News, 3, 2003
Changed line 18 from:
- A Practical Guide to Support Vector Classification
to:
- A Practical Guide to Support Vector Classification
Added line 18:
- A Practical Guide to Support Vector Classification
Added line 18:
- An Introduction to Recursive Partitioning Using the RPART Routines (abridged version, without the theoretical description)
Changed lines 22-23 from:
to:
Changed line 25 from:
to:
Changed lines 6-8 from:
to:
- Zadanie1 (exploratory data analysis)
- Zadanie2 (normal distribution and simulation/generation)
- Zadanie3 (multiple-kernel SVMs)
Changed line 13 from:
to:
- Zadanie4 (Bayesian networks and association rules)
Changed line 16 from:
to:
- Zadanie5 (classification, decision trees)
Changed lines 20-27 from:
to:
- Zadanie6 (clustering, hierarchical methods, estimating the number of clusters)
- Statistics with R: Clustering
- Zadanie7 (time series)
- Statistics with R: Time series
- Zadanie8 (“non-standard” databases)
- Mondrian is an OLAP server written in Java. It enables you to interactively analyze very large datasets stored in SQL databases without writing SQL.
Changed lines 16-20 from:
to:
- Zadanie5
- Statistics with R: Chapter 12, Using regression in a classification problem, Nearest Neighbours, Naive Bayes classifier
- Classification and regression by randomForest R News, 3, 2002
- KLIMT Making trees interactive (Klassification - Interactive Methods for Trees)
Changed line 24 from:
- Topics from the book “The Handbook of Data Mining”
to:
- Topics from the book “The Handbook of Data Mining”. You may choose one chapter or several related chapters. [all available]
Changed line 35 from:
23 Mining Science and Engineering Data (Chandrika Kamath)
to:
- 23 Mining Science and Engineering Data (Chandrika Kamath)
Deleted lines 16-17:
Changed lines 24-43 from:
to:
- Topics from the book “The Handbook of Data Mining”
- 13 Distributed Data Mining (Byung-Hoon Park and Hillol Kargupta)
- II: MANAGEMENT OF DATA MINING: 14 Data Collection, Preparation, Quality, and Visualization (Dorian Pyle)
- 15 Data Storage and Management (Tong (Teresa) Wu and Xiangyang (Sean) Li)
- 16 Feature Extraction, Selection, and Construction (Huan Liu, Lei Yu, and Hiroshi Motoda)
- 17 Performance Analysis and Evaluation (Sholom M. Weiss and Tong Zhang)
- 18 Security and Privacy (Chris Clifton)
- 19 Emerging Standards and Interfaces (Robert Grossman, Mark Hornick, and Gregor Meyer)
- III: APPLICATIONS OF DATA MINING: 20 Mining Human Performance Data (David A. Nembhard)
- 21 Mining Text Data (Ronen Feldman)
- 22 Mining Geospatial Data (Shashi Shekhar and Ranga Raju Vatsavai)
23 Mining Science and Engineering Data (Chandrika Kamath)
- 24 Mining Data in Bioinformatics (Mohammed J. Zaki)
- 25 Mining Customer Relationship Management (CRM) Data (Robert Cooley)
- 26 Mining Computer and Network Security Data (Nong Ye)
- 27 Mining Image Data (Chabane Djeraba and Gregory Fernandez)
- 28 Mining Manufacturing Quality Data (Murat C. Testik and George C. Runger)
Changed lines 13-18 from:
to:
- Zadanie4
- Short Overview of Bayes Nets, Inference in Bayesian Networks, Learning Bayesian Networks: Tutorial Slides by Andrew Moore
- Bayes Net Toolbox for Matlab Written by Kevin Murphy, 1997—2002: A Brief Introduction to Graphical Models and Bayesian Networks, How to use the toolbox
-
Changed lines 14-21 from:
to:
Proposed presentation topics (work in progress):
- Communicating with a database from within the R environment. [taken]
- Automatic report generation in the R environment. [available]
- Topics from Learning classifiers from distributed, semantically heterogeneous, autonomous data sources: [all available]
- Chapter three: “Learning classifiers from distributed data”,
- Chapter four: “Learning classifiers from semantically heterogeneous data”,
- If there is interest, we can cover more.
Added line 16:
- Learning classifiers from distributed, semantically heterogeneous, autonomous data sources
Added lines 25-26:
WARNING: Old stuff below.
Changed lines 9-14 from:
to:
- Support Vector Machines, Tutorial Slides by Andrew Moore, Support Vector and Kernel Machines
- Sparse and large-scale learning with heterogeneous data
- SVM-Tutorial using R (e1071-package)
- Shogun - A Large Scale Machine Learning Toolbox
Added line 16:
- Statistics with R, a very interesting course on R
Changed lines 7-8 from:
to:
Changed lines 1-8 from:
to:
Links for the lab:
Other links for the class:
Deleted lines 35-40:
Changed lines 30-34 from:
to:
Added line 4:
- Introduction to R and Exploratory data analysis
Changed lines 4-7 from:
to:
- Using R for Introductory Statistics (pre-draft of a published book)
- Using R for data analysis and graphics (JH Maindonald)
- from Introduction to R (a collection of links by Karl W Broman)
Changed lines 27-28 from:
to:
- Wired: This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize
Changed lines 3-4 from:
to:
- Extra statistical functions for Octave, among them boxplot.
Added line 10:
Added lines 14-16:
Changed lines 14-15 from:
to:
Added lines 9-11:
Software:
- Vowpal Wabbit (Fast Online Learning)
Changed lines 7-8 from:
- Sparse and large-scale learning with heterogeneous data, Google Tech Talk
to:
- Sparse and large-scale learning with heterogeneous data, Google Tech Talk
- FilterBoost: Regression and Classification on Large Datasets, Joseph K. Bradley and Robert E. Schapire
Changed lines 6-7 from:
- Recommendation Based on Personal Preference, Pei Wang. (Asking a database questions like: “give me five best examples of fast and cheap notebook”, “find cheap ticket for a flight that leaves C around 9AM and arrives at D as early as possible”.)
to:
- Recommendation Based on Personal Preference, Pei Wang. (Asking a database questions like: “give me five best examples of fast and cheap notebook”, “find cheap ticket for a flight that leaves C around 9AM and arrives at D as early as possible”.)
- Sparse and large-scale learning with heterogeneous data, Google Tech Talk
Added line 6:
- Recommendation Based on Personal Preference, Pei Wang. (Asking a database questions like: “give me five best examples of fast and cheap notebook”, “find cheap ticket for a flight that leaves C around 9AM and arrives at D as early as possible”.)
Added lines 1-5:
Some random links for now:
- Information Retrieval (Searching the Web), Tomasz Jurdzinski course and links from there
- Picture languages in machine understanding of medical visualization, Marek R. Ogiela, Ryszard Tadeusiewicz
- Learning the structure of image collections with latent aspect models, Florent Monay