THE SQL Server Blog Spot on the Web


  • PASS SQL Saturday #460 Slovenia 2015

    So we are back. PASS SQL Saturday is coming to Slovenia again on December 12th, 2015. Remember the last two years? We had two great events. According to the feedback, everybody was satisfied and happy. Let's make another outstanding event! How can you help? First of all, these events are free for attendees. Of course, this is possible only because ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on June 27, 2015
  • Data Mining Algorithms – Support Vector Machines

    Support vector machines are both supervised and unsupervised learning models: they are used for classification and regression analysis (supervised) and for anomaly detection (unsupervised). Given a set of training examples, each marked as belonging to one of the categories, an SVM training algorithm builds a model that assigns new examples to one of those categories. An SVM ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on June 23, 2015
  • Data Mining Algorithms – Principal Component Analysis

    Principal component analysis (PCA) is a technique used to emphasize the majority of the variation and bring out strong patterns in a dataset. It is often used to make data easy to explore and visualize. It is closely connected to eigenvectors and eigenvalues. A short definition of the algorithm: PCA uses an orthogonal transformation to convert ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on June 2, 2015
  • Data Mining Algorithms – EM Clustering

    With the K-Means algorithm, each object is assigned to exactly one cluster. It is assigned to this cluster with a probability equal to 1.0. It is assigned to all other clusters with a probability equal to 0.0. This is hard clustering. Instead of distance, you can use a probabilistic measure to determine cluster membership. For example, you can ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on May 12, 2015
  • Data Mining Algorithms – K-Means Clustering

    Hierarchical clustering can be very useful because it is easy to see the optimal number of clusters in a dendrogram and because the dendrogram visualizes both the clusters and the process of building those clusters. However, hierarchical methods don't scale well. Just imagine how cluttered a dendrogram would be if 10,000 cases were shown on ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on April 17, 2015
  • re: Data Mining Algorithms – Hierarchical Clustering

    Hierarchical clustering can use any kind of distance; in fact, it does not need the original cases once the distance matrix is built. Therefore, you can use a distance that takes correlations into account, like the Mahalanobis distance (http://en.wikipedia.org/wiki/Mahalanobis_distance). MS supports K-Means and Expectation-Maximization ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on March 30, 2015
  • Data Mining Algorithms – Hierarchical Clustering

    Clustering is the process of grouping the data into classes or clusters so that objects within a cluster have high similarity in comparison to one another, but are very dissimilar to objects in other clusters. Dissimilarities are assessed based on the attribute values describing the objects. There are a large number of clustering algorithms. The ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on March 28, 2015
  • re: Data Mining Algorithms – Association Rules

    Kevin, first of all, thank you for your kind comment. In SQL, you typically search for distinct combinations of items in the same transaction with either the JOIN or the APPLY operator. I prefer APPLY. Below is an example that finds itemsets of size 1, 2, and 3. However, I would not recommend doing this in SQL - why would you reinvent the wheel? You ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on March 25, 2015
  • Data Mining Algorithms – Association Rules

    The Association Rules algorithm is specifically designed for use in market basket analyses. This knowledge can additionally help in identifying cross-selling opportunities and in arranging attractive packages of products. This is the most popular algorithm used in web sales. You can even include additional discrete input variables and predict ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on March 10, 2015
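
The reply on Association Rules above mentions finding itemsets of size 1, 2, and 3 within the same transaction. The SQL from that post is not reproduced in this listing, so the following is only a minimal Python sketch of the same counting idea, not the author's query. The function name `count_itemsets` and the sample baskets are hypothetical and invented for illustration.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=3):
    """Count how many transactions contain each itemset of size 1..max_size."""
    counts = Counter()
    for items in transactions:
        # Distinct items in a canonical order, so that ("bread", "milk")
        # and ("milk", "bread") are counted as the same itemset.
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return counts


# Hypothetical market baskets, invented for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
]

support = count_itemsets(baskets)
for itemset, n in sorted(support.items()):
    print(itemset, n)
```

Each count is the number of baskets that contain the itemset, which is the raw support figure an association rules algorithm starts from. Enumerating every combination grows quickly with basket size, which is one reason a dedicated implementation is usually preferable to hand-written queries.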