



I am finishing my list of conferences and seminars I am attending in the second half of 2015. Here is my list. Kulendayz 2015 – September 4th–5th. Although it will be hard to get there on time, I would hate to miss it. I have one talk there. SQL Saturday #413 Denmark – September 17th–19th. You can join me already on ...

This is a slightly different post in the series about data mining and machine learning algorithms. This time I am honored and humbled to announce that my fourth Pluralsight course is live. This is the Data Mining Algorithms in SSAS, Excel, and R course. Besides explaining the algorithms, I also show demos in different products. This gives you even ...

So we are back. PASS SQL Saturday is coming to Slovenia again on December 12th, 2015. Remember the last two years? We had two great events. According to the feedback, everybody was satisfied and happy. Let's make another outstanding event! How can you help?
First of all, these events are free for attendees. Of course, this is possible only because ...

Support vector machines are both supervised and unsupervised learning models: supervised for classification and regression analysis, and unsupervised for anomaly detection. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples into one category. An SVM ...
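To make the idea concrete, here is a minimal sketch in R using the e1071 package (my choice for illustration; the demos in the post may use different tools). It shows both the supervised and the unsupervised flavor:

library(e1071)
data(iris)

# Supervised: train an SVM classifier on labeled examples
svmFit <- svm(Species ~ ., data = iris, kernel = "radial")
predict(svmFit, iris[1:5, -5])

# Unsupervised: a one-class SVM for anomaly detection
ocSvm <- svm(iris[, -5], type = "one-classification", nu = 0.05)
table(predict(ocSvm))   # TRUE = inlier, FALSE = potential anomaly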

Principal component analysis (PCA) is a technique used to emphasize the majority of the variation and bring out strong patterns in a dataset. It is often used to make data easy to explore and visualize. It is closely connected to eigenvectors and eigenvalues.
A short definition of the algorithm: PCA uses an orthogonal transformation to convert ...
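As a quick illustration, base R's prcomp function (my choice here, not necessarily what the post demos use) performs exactly this orthogonal transformation:

data(iris)
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
summary(pca)         # proportion of variance explained per component
pca$rotation         # eigenvectors (loadings) of the correlation matrix
head(pca$x[, 1:2])   # cases projected onto the first two components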


With the KMeans algorithm, each object is assigned to exactly one cluster: it belongs to that cluster with probability 1.0 and to every other cluster with probability 0.0. This is hard clustering.
Instead of distance, you can use a probabilistic measure to determine cluster membership. For example, you can ...
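A short sketch of the difference in R: base kmeans returns a hard assignment, while the mclust package (an assumption on my part) fits a mixture model with Expectation-Maximization and returns membership probabilities:

library(mclust)
data(iris)
x <- iris[, 1:4]

km <- kmeans(x, centers = 3)
head(km$cluster)       # hard: each case in exactly one cluster

em <- Mclust(x, G = 3)
head(round(em$z, 3))   # soft: probability of membership in each cluster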

Hierarchical clustering can be very useful because it is easy to see the optimal number of clusters in a dendrogram, and because the dendrogram visualizes the clusters and the process of building them. However, hierarchical methods don’t scale well. Just imagine how cluttered a dendrogram would be if 10,000 cases were shown on ...
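A minimal sketch with base R's hclust shows why: even a 30-case sample produces a fairly busy dendrogram (the function and parameter choices here are mine, for illustration only):

data(iris)
set.seed(42)
smp <- iris[sample(nrow(iris), 30), 1:4]
hc <- hclust(dist(smp), method = "ward.D2")
plot(hc)                        # inspect the dendrogram for a natural cut
clusters <- cutree(hc, k = 3)   # hard assignment at the chosen number of clusters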

Hierarchical clustering can use any kind of distance; in fact, it does not need the original cases once the distance matrix is built. Therefore, you can use a distance that takes correlations into account, like the Mahalanobis distance (http://en.wikipedia.org/wiki/Mahalanobis_distance).
MS supports KMeans and ExpectationMaximization ...
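As a hedged sketch of the idea in R: hclust only needs a distance matrix, so one way (my construction, not taken from the post) to use pairwise Mahalanobis distances is to whiten the data with the Cholesky factor of the covariance matrix, after which plain Euclidean distances on the whitened data equal Mahalanobis distances on the original:

data(iris)
x <- as.matrix(iris[, 1:4])
w <- x %*% solve(chol(cov(x)))   # whitening: Euclidean on w == Mahalanobis on x
hc <- hclust(dist(w), method = "average")
plot(hc)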



