DataMining.DataMining History
May 21, 2009, at 02:54 PM
by - big data
Added line 2:
* [[http://radar.oreilly.com/2009/05/big-data-analytics-r-linked-data-ssd.html | Big Data: SSD's, R, and Linked Data Streams]]
February 06, 2009, at 12:56 AM
by - DM blog
Changed lines 1-3 from:
to:
Featured links:
* [[http://dataminingwarehousing.blogspot.com/ | Data Mining and Warehousing blog]]
Changed line 47 from:
* [[http://www.liaad.up.pt/~ltorgo/DataMiningWithR/ | Data Mining with R: learning by case studies]] (by Luis Torgo),
to:
* [[http://www.liaad.up.pt/~ltorgo/DataMiningWithR/ | Data Mining with R: learning by case studies]] (by Luis Torgo), nice tutorial with SQL interaction
September 19, 2008, at 02:21 PM
by - link to a great tutorial
Added lines 45-55:
Other links for the class:
* [[http://www.liaad.up.pt/~ltorgo/DataMiningWithR/ | Data Mining with R: learning by case studies]] (by Luis Torgo), seems excellent!
* [[http://zoonek2.free.fr/UNIX/48_R/all.html | Statistics with R]], a very interesting but unpolished course on R
* [[http://cc.oulu.fi/~jarioksa/opetus/metodi/eda.pdf | Introduction to R and Exploratory data analysis]]
* [[http://www.math.csi.cuny.edu/Statistics/R/simpleR/index.html | Using R for Introductory Statistics]] (pre-draft of a published book)
* [[http://cran.r-project.org/doc/contrib/usingR.pdf | Using R for data analysis and graphics]] (JH Maindonald)
** from [[http://www.biostat.wisc.edu/~kbroman/Rintro/ | Introduction to R]] (a collection of links by Karl W Broman)
* [[http://www.cs.iastate.edu/~honavar/Papers/caragea-thesis.pdf | Learning classifiers from distributed, semantically heterogeneous, autonomous data sources]]
Deleted lines 81-91:
Other links for the class:
* [[http://www.cs.iastate.edu/~honavar/Papers/caragea-thesis.pdf | Learning classifiers from distributed, semantically heterogeneous, autonomous data sources]]
* [[http://zoonek2.free.fr/UNIX/48_R/all.html | Statistics with R]], a very interesting course on R
* [[(Wikipedia:)Exploratory data analysis]]
* [[http://octave.sourceforge.net/doc/funref_statistics.html | Extra statistical functions for Octave]], among them @@boxplot@@.
* [[http://cc.oulu.fi/~jarioksa/opetus/metodi/eda.pdf | Introduction to R and Exploratory data analysis]]
* [[http://www.math.csi.cuny.edu/Statistics/R/simpleR/index.html | Using R for Introductory Statistics]] (pre-draft of a published book)
* [[http://cran.r-project.org/doc/contrib/usingR.pdf | Using R for data analysis and graphics]] (JH Maindonald)
** from [[http://www.biostat.wisc.edu/~kbroman/Rintro/ | Introduction to R]] (a collection of links by Karl W Broman)
Deleted line 4:
June 20, 2008, at 09:45 AM
by - time series link
Added line 30:
** [[http://www.scausa.com/DataMiningOnTimeSeries2.pdf | Data mining on time series: an illustration using fast-food restaurant franchise data]]
June 20, 2008, at 03:48 AM
by - RDF databases
Deleted line 30:
Changed lines 32-33 from:
to:
** [[http://www.ltg.ed.ac.uk/np/publications/ltg/papers/Byrne2006Tethering.pdf | Tethering Cultural Data with RDF]] (Kate Byrne)
*** [[http://www.opencog.org/wiki/RelEx | RelEx]] is an English-language semantic relationship extractor, built on the Carnegie-Mellon link parser
** Databases or database "interfaces":
*** [[http://openrdf.org/ | Sesame]] is an open source framework for storage, inferencing and querying of RDF data.
*** [[http://mondrian.pentaho.org/ | Mondrian]] is an OLAP server written in Java. It enables you to interactively analyze very large datasets stored in SQL databases without writing SQL.
*** [[the Exist XML database -> http://exist.sourceforge.net/]]
*** [[the RDF/SPARQL/XML part of the OpenLink Virtuoso system -> http://sourceforge.net/projects/virtuoso/]]
*** [[the Jena/ARQ combo -> http://jena.sourceforge.net/ARQ/]]
*** [[http://www.kobrix.com/hgdb.jsp | HyperGraphDB]]
*** [[http://www.intellidimension.com/ | RDF Gateway]] (commercial)
*** [[http://4suite.org/index.xhtml | 4Suite: an open-source platform for XML and RDF processing]]
*** [[http://rx4rdf.liminalzone.org/ | Rx4RDF]]
June 20, 2008, at 02:51 AM
by - RDF and graph databases
Changed lines 32-33 from:
to:
** [[http://www.dcc.uchile.cl/~cgutierr/papers/eswc05.pdf | Querying RDF Data from a Graph Database Perspective]] (Renzo Angles and Claudio Gutierrez), [[http://www.ciw.cl/material/irw-2005/2005-irw-gutierrez.pdf | Querying from a Graph Database Perspective: the case of RDF]] -- presentation
June 06, 2008, at 07:57 AM
by - projection pursuit PCA
Added lines 25-27:
** [[http://zoonek2.free.fr/UNIX/48_R/05.html | Statistics with R: Factorial methods: Around Principal Component Analysis (PCA)]]
** [[http://www.r-project.org/useR-2006/Slides/Fritz.pdf | PCA by Projection Pursuit. The Package pcaPP]] Heinrich Fritz
** [[http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf | Dimensional Reduction for Data Mapping]] (Jonathan Edwards and Paul Oman) R News, 3, 2003
Changed line 18 from:
** [[http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf | A Practical Guide to Support Vector
to:
** [[http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf | A Practical Guide to Support Vector Classification]]
May 30, 2008, at 04:19 PM
by - SVM guide
Added line 18:
** [[http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf | A Practical Guide to Support Vector Classification]]
May 23, 2008, at 03:07 AM
by - rpart
Added line 18:
** [[http://www.mayo.edu/hsr/techrpt/61.pdf | An Introduction to Recursive Partitioning Using the RPART Routines]] ([[http://ndc.mayo.edu/mayo/research/biostat/upload/rpartmini.pdf | abridged version]] -- without the theoretical description)
May 21, 2008, at 06:45 PM
by - PCA assignment
Changed lines 22-23 from:
* [[Zadanie7]] (time series)
to:
* [[Zadanie7]] (data visualization)
* [[Zadanie8]] (time series)
Changed line 25 from:
* [[
to:
* [[Zadanie9]] ("non-standard" databases)
May 21, 2008, at 06:31 PM
by - assignment plan
Changed lines 6-8 from:
* [[Zadanie1]]
to:
* [[Zadanie1]] (exploratory data analysis)
* [[Zadanie2]] (normal distribution and simulation/generation)
* [[Zadanie3]] (multi-kernel SVMs)
Changed line 13 from:
* [[Zadanie4]]
to:
* [[Zadanie4]] (Bayesian networks and association rules)
Changed line 16 from:
* [[Zadanie5]]
to:
* [[Zadanie5]] (classification, decision trees)
Changed lines 20-27 from:
to:
* [[Zadanie6]] (clustering, hierarchical methods, estimating the number of clusters)
** [[http://zoonek2.free.fr/UNIX/48_R/06.html | Statistics with R: Clustering]]
* [[Zadanie7]] (time series)
** [[http://zoonek2.free.fr/UNIX/48_R/15.html | Statistics with R: Time series]]
* [[Zadanie8]] ("non-standard" databases)
** [[http://mondrian.pentaho.org/ | Mondrian]] is an OLAP server written in Java. It enables you to interactively analyze very large datasets stored in SQL databases without writing SQL.
May 16, 2008, at 07:38 AM
by - assignment 5 -- classification
Changed lines 16-20 from:
to:
* [[Zadanie5]]
** [[http://zoonek2.free.fr/UNIX/48_R/12.html | Statistics with R: Chapter 12]] Using regression in a classification problem, Nearest Neighbours, Naive Bayes classifier
** [[http://cran.r-project.org/doc/Rnews/Rnews_2002-3.pdf | Classification and regression by randomForest]] R News, 3, 2002
** [[http://stats.math.uni-augsburg.de/Klimt/features.html | KLIMT Making trees interactive]] (Klassification - Interactive Methods for Trees)
April 25, 2008, at 12:35 PM
by - \
Changed line 24 from:
# Topics from the book "The Handbook of Data Mining"
to:
# Topics from the book "The Handbook of Data Mining". You may choose one chapter or several related chapters. [all available]
Changed line 35 from:
23 Mining Science and Engineering Data 549 (Chandrika Kamath)
to:
## 23 Mining Science and Engineering Data 549 (Chandrika Kamath)
April 25, 2008, at 12:32 PM
by - handbook presentations
Deleted lines 16-17:
Changed lines 24-43 from:
to:
# Topics from the book "The Handbook of Data Mining"
## 13 Distributed Data Mining 341 (Byung-Hoon Park and Hillol Kargupta)
## II: MANAGEMENT OF DATA MINING 14 Data Collection, Preparation, Quality, and Visualization 365 (Dorian Pyle)
## 15 Data Storage and Management 393 (Tong (Teresa) Wu and Xiangyang (Sean) Li)
## 16 Feature Extraction, Selection, and Construction 409 (Huan Liu, Lei Yu, and Hiroshi Motoda)
## 17 Performance Analysis and Evaluation 425 (Sholom M. Weiss and Tong Zhang)
## 18 Security and Privacy 441 (Chris Clifton)
## 19 Emerging Standards and Interfaces 453 (Robert Grossman, Mark Hornick, and Gregor Meyer)
## III: APPLICATIONS OF DATA MINING 20 Mining Human Performance Data 463 (David A. Nembhard)
## 21 Mining Text Data 481 (Ronen Feldman)
## 22 Mining Geospatial Data 519 (Shashi Shekhar and Ranga Raju Vatsavai)
23 Mining Science and Engineering Data 549 (Chandrika Kamath)
## 24 Mining Data in Bioinformatics 573 (Mohammed J. Zaki)
## 25 Mining Customer Relationship Management (CRM) Data 597 (Robert Cooley)
## 26 Mining Computer and Network Security Data 617 (Nong Ye)
## 27 Mining Image Data 637 (Chabane Djeraba and Gregory Fernandez)
## 28 Mining Manufacturing Quality Data 657 (Murat C. Testik and George C. Runger)
April 25, 2008, at 07:07 AM
by - BNT, assignment 4
Changed lines 13-18 from:
to:
* [[Zadanie4]]
** [[http://www.autonlab.org/tutorials/shortbayes.html | Short Overview of Bayes Nets]], [[Inference in Bayesian Networks -> http://www.autonlab.org/tutorials/bayesinf.html]] [[http://www.autonlab.org/tutorials/bayesstruct.html | Learning Bayesian Networks]]: Tutorial Slides by Andrew Moore
** [[http://www.cs.ubc.ca/~murphyk/Software/BNT/bnt.html | Bayes Net Toolbox for Matlab]] Written by Kevin Murphy, 1997--2002: [[http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html | A Brief Introduction to Graphical Models and Bayesian Networks]], [[http://www.cs.ubc.ca/~murphyk/Software/BNT/usage.html | How to use the toolbox]]
**
April 11, 2008, at 10:27 PM
by - presentations, first draft
Changed lines 14-21 from:
to:
Proposed presentation topics (work in progress):
# Communicating with a database from within the R environment. [taken]
# Automatic report generation in the R environment. [available]
# Topics from [[http://www.cs.iastate.edu/~honavar/Papers/caragea-thesis.pdf | Learning classifiers from distributed, semantically heterogeneous, autonomous data sources]]: [all available]
## Chapter three: "Learning classifiers from distributed data",
## Chapter four: "Learning classifiers from semantically heterogeneous data",
## If there is interest, we can cover more.
April 11, 2008, at 09:39 PM
by - learning from distributed sources
Added line 16:
* [[http://www.cs.iastate.edu/~honavar/Papers/caragea-thesis.pdf | Learning classifiers from distributed, semantically heterogeneous, autonomous data sources]]
Added lines 25-26:
WARNING: Old stuff below.
April 11, 2008, at 04:30 AM
by - SVMs, learning with heterogeneous data
Changed lines 9-14 from:
to:
** [[http://www.autonlab.org/tutorials/svm15.pdf | Support Vector Machines, Tutorial Slides by Andrew Moore]], [[http://www.support-vector.net/icml-tutorial.pdf | Support Vector and Kernel Machines]]
** [[http://video.google.pl/videoplay?docid=4867582015325197740 | Sparse and large-scale learning with heterogeneous data]]
** [[http://www.potschi.de/svmtut/svmtut.html | SVM-Tutorial using R (e1071-package)]]
** [[http://www.shogun-toolbox.org/ | Shogun - A Large Scale Machine Learning Toolbox]]
Added line 16:
* [[http://zoonek2.free.fr/UNIX/48_R/all.html | Statistics with R]], a very interesting course on R
April 04, 2008, at 08:04 AM
by - HD project
Added line 5:
* [[Project, Dr. Leszek Grocholski -> Attach:projektHD.doc]]
March 28, 2008, at 01:21 AM
by - assignment 2
Changed lines 1-8 from:
to:
Links for the lab:
* [[Attach:zadania.pdf]]
* [[Attach:introOctaveR.pdf | Introduction to the Octave and R languages]]
* [[Zadanie1]]
* [[Zadanie2]]
Other links for the class:
Deleted lines 35-40:
Links for the lab:
* [[Attach:zadania.pdf]]
* [[Attach:introOctaveR.pdf | Introduction to the Octave and R languages]]
* [[Zadanie1]]
March 20, 2008, at 06:00 AM
by - lab links
Changed lines 30-34 from:
to:
Links for the lab:
* [[Attach:zadania.pdf]]
* [[Attach:introOctaveR.pdf | Introduction to the Octave and R languages]]
* [[Zadanie1]]
March 19, 2008, at 11:39 PM
by - R EDA link
Added line 4:
* [[http://cc.oulu.fi/~jarioksa/opetus/metodi/eda.pdf | Introduction to R and Exploratory data analysis]]
March 19, 2008, at 02:03 AM
by - R links
Changed lines 4-7 from:
to:
* [[http://www.math.csi.cuny.edu/Statistics/R/simpleR/index.html | Using R for Introductory Statistics]] (pre-draft of a published book)
* [[http://cran.r-project.org/doc/contrib/usingR.pdf | Using R for data analysis and graphics]] (JH Maindonald)
** from [[http://www.biostat.wisc.edu/~kbroman/Rintro/ | Introduction to R]] (a collection of links by Karl W Broman)
Changed lines 27-28 from:
to:
* [[http://www.wired.com/techbiz/media/magazine/16-03/mf_netflix?currentPage=1 | Wired: This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize]]
March 14, 2008, at 04:33 AM
by - extra stat for Octave
Changed lines 3-4 from:
to:
* [[http://octave.sourceforge.net/doc/funref_statistics.html | Extra statistical functions for Octave]], among them @@boxplot@@.
March 14, 2008, at 04:27 AM
by - exploratory data analysis
Added lines 1-4:
Links for the class:
* [[(Wikipedia:)Exploratory data analysis]]
January 26, 2008, at 08:56 PM
by - places
Added line 10:
Added lines 14-16:
Places:
* [[http://research.google.com/ | Google Research]]
January 25, 2008, at 07:41 PM
by - google datastore
Changed lines 14-15 from:
* [[Wikipedia:Petabyte#Petabytes_in_use]]
to:
* [[Wikipedia:Petabyte#Petabytes_in_use]]
* [[http://arstechnica.com/news.ars/post/20080122-rumors-suggest-google-is-set-to-open-scientific-data-store.html | Arstechnica: Rumors suggest Google is set to open scientific data store]]
January 24, 2008, at 10:58 PM
by - petabytes
Added lines 12-14:
Amusements:
* [[Wikipedia:Petabyte#Petabytes_in_use]]
December 27, 2007, at 09:36 PM
by - learning library
Added lines 9-11:
Software:
* [[http://hunch.net/~vw/ | Vowpal Wabbit (Fast Online Learning)]]
December 21, 2007, at 11:15 PM
by - FilterBoost
Changed lines 7-8 from:
* [[http://video.google.com/videoplay?docid=4867582015325197740 | Sparse and large-scale learning with heterogeneous data]], Google Tech
to:
* [[http://video.google.com/videoplay?docid=4867582015325197740 | Sparse and large-scale learning with heterogeneous data]], Google Tech Talk
* [[http://www.cs.cmu.edu/~jkbradle/ | FilterBoost: Regression and Classification on Large Datasets]], Joseph K. Bradley and Robert E. Schapire
October 03, 2007, at 01:39 AM
by - Sparse and large-scale learning with heterogeneous data
Changed lines 6-7 from:
* [[http://nars.wang.googlepages.com/wang.preference.pdf | Recommendation Based on Personal Preference]], Pei Wang. (Asking a database questions like: "give me five best examples of fast and cheap notebook", "find cheap ticket for a flight that leaves C around 9AM and arrives at D as early as possible".)
to:
* [[http://nars.wang.googlepages.com/wang.preference.pdf | Recommendation Based on Personal Preference]], Pei Wang. (Asking a database questions like: "give me five best examples of fast and cheap notebook", "find cheap ticket for a flight that leaves C around 9AM and arrives at D as early as possible".)
* [[http://video.google.com/videoplay?docid=4867582015325197740 | Sparse and large-scale learning with heterogeneous data]], Google Tech Talk
September 15, 2007, at 01:39 PM
by - Pei Wang preferences
Added line 6:
* [[http://nars.wang.googlepages.com/wang.preference.pdf | Recommendation Based on Personal Preference]], Pei Wang. (Asking a database questions like: "give me five best examples of fast and cheap notebook", "find cheap ticket for a flight that leaves C around 9AM and arrives at D as early as possible".)
September 08, 2007, at 10:20 PM
by - IR, picture analysis
Added lines 1-5:
Some random links for now:
* [[http://www.ii.uni.wroc.pl/%7Etju/Wyszukiwanie07/wyszukiwanie07.html | Information Retrieval (Searching the Web)]], Tomasz Jurdzinski's course and links from there
* [[http://portal.acm.org/citation.cfm?id=1133508 | Picture languages in machine understanding of medical visualization]], Marek R. Ogiela, Ryszard Tadeusiewicz
* [[http://library.epfl.ch/theses/?nr=3729 | Learning the structure of image collections with latent aspect models]], Florent Monay