ME-MO: 2014

Wednesday, July 2, 2014

Emulsion的使用步骤

最近很喜欢兰芝的产品。。气味控 <嗅嗅>

去target的时候忍不住买了balance emulsion。然后回来不知道怎么用。

然后就发现了这个妹子的博文。

留着借鉴:)

Monday, June 30, 2014

Path Sum

Given a binary tree and a sum, determine if the tree has a root-to-leaf path such that adding up all the values along the path equals the given sum.

For example:
Given the below binary tree and sum = 22,

              5
             / \
            4   8
           /   / \
          11  13  4
         /  \      \
        7    2      1

return true, as there exist a root-to-leaf path 5->4->11->2 which sum is 22.

Something worth noticing in the problem:

a. It is a binary tree

b. only if there is a "root-to-leaf" path , is it a valid path

              5
             / \    given this tree and sum = 5, it is not a valid path since 
            4   8   it doesn't get down to the leaf.

It is clearly a recursive problem. Pass the current required sum to certain root until the leaf and see whether it is a valid path

# Definition for a  binary tree node
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Solution:
    # @param root, a tree node
    # @param sum, an integer
    # @return a boolean

    def hasPathSum(self, root, sum):
        if root is None:
             return False
        if root.val == sum and root.left is None and root.right is None:
            return True
        if root.val != sum and root.left is None and root.right is None:
            return False
        else:
            return self.hasPathSum(root.left,sum-root.val) or self.hasPathSum(root.right, sum-root.val)

Sunday, June 29, 2014

Given a binary tree, determine if it is height-balanced. For this problem, a height-balanced binary tree is defined as a binary tree in which the depth of the two subtrees of every node never differ by more than 1.

The same question appears on <Cracking> too (p86 4.1)

The key points to this problem is:

a. check whether the current tree is balanced
i.e., |height(left sub tree) - height(right sub tree)| <= 1

b. check whether the left sub tree and right sub tree of the current tree is balanced
It could be the case that even though the left, right height of the current tree differs no more
than one, the two trees are not balanced by themselves. e.g. the left sub tree could only
have left child and is an array...>__<

c. what is the height of the current tree ?
height of the current tree is the maximum height of left and right sub tree + 1
(even by wording, it is recursive in nature haha)

Something worth noticing:

a. since there are recursions in the process, it could lead to duplicates recursion.
notice that when you are calculating the height of the current tree, you actually already
calculate the height of its left and right sub trees. Use a DP :)

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Solution:
    # @param root, a tree node
    # @return a boolean
    def __init__(self):
        self.dp = {}

    def height(self, root):
        if root is None:
            return 0
        else:
            lh, rh = self.height(root.left), self.height(root.right)
            rh = max(lh, rh) + 1
            self.dp[root.left], self.dp[root.right] =lh, rh
        return h

    def isBalanced(self, root):
        if root is None:
            return True
        else:
            # this checks 1.whether is current tree is balanced; 2. whether its two sub trees are balanced
            return abs(self.height(root.left)-self.height(root.right)) <= 1 and self.isBalanced(root.left) and self.isBalanced(root.right)

Friday, June 27, 2014

Find the middle node in a singly linked list

Find the mid point (towards the end):

sample : 1-->2-->3 return 2 (for odd length of singly linked list,return the mid point)

sample : 1-->2-->3-->4 return 3 (for even length of singly linked list, return the first node of second-evenly-divided half)

Obviously you need to use runner's technique:

def findMid(head):
   if head == None or head.next == None:
        return head
   slow, fast = head, head
   while fast != None and fast.next != None:
       fast, slow = fast.next, slow.next
       if fast.next == None:
           break
       else:
           fast = fast.next

Find the mid point's previous node:

sample : 1-->2-->3 return 2 (for odd length of singly linked list,return the mid point)

sample : 1-->2-->3-->4 return 2 (for even length of singly linked list, return the last node of first-evenly-divided half)

Sometimes you want to divide a singly linked list into two halves and normally in this case, dividing means that you need to break the list into half, so it is crucial to get the node that is previous to the head node of the second half.

In the above case, after you get 2 from the first and second half sample, you need to set
prev.next = None

Until this is set, the break could be considered completed. On top of runner technique, you want to add a dummy node to the original head of the singly linked list, so that it returns the "previous node"

def findMid(head):
   if head == None or head.next == None:
        return head
   dummy = ListNode(0)
   dummy.next = head
   slow, fast = dummy, dummy
   while fast != None and fast.next != None:
       fast, slow = fast.next, slow.next
       if fast.next == None:
           break
       else:
           fast = fast.next

Reverse Singly LInked List

Very Simple -- but what to notice is that you have to think at least two steps!! Sometimes, if you just think one step, it is the special cases :)

class ListNode:
 def __init__(self,val):
  self.val = val
  self.next = None


def reverseLinkedList(head):
 if head == None or head.next == None:
  return head
 dummy = ListNode(0)
 dummy.next = head
 cur = head.next
 while cur != None:
  newcur = cur.next
  tstart = dummy.next
  dummy.next = cur
  cur.next = tstart
  head.next = newcur
  cur = newcur

 return dummy.next

## test block
a = ListNode(1)
b = ListNode(2)
c = ListNode(3)
d = ListNode(4)
a.next = b
b.next = c
c.next = d

x = reverseLinkedList(a)
while x is not None:
 print(x.val)
 x = x.next

This is the way that uses dummy head, but actually you dont really need the dummy, in this iterative function, the only nodes that needed to be tracked is the new head and the current node, which is the one to be added to the front, and until current is None, the traversal through the entire list is done, and the you return the current new head, and that's it!

def reverseLinkedList(head):
   if head == None or head.next == None:
       return head
   # last is the new head, and also is the real 'last' node of the current node, it has a meaning :)
   last = haed
   current = head.next
   while current is not None:
       head.next = current.next
       current.next = last
       last = current
       current = head.next
   return last

The above code is also "in-place" reverse. maybe it would be a bit concise :)

Thursday, June 26, 2014

Tail Recursion & Iterative way to traversal a binary tree (inorder)

First of all, introduce the recursive way to do in-order traversal of a binary tree, which is trivial XD.

The following function is to flatten a binary tree into an array that is traversed in-order.

def inorder(root):
    if root == None:
       return []
    else:
       lt = inorder(root.left) 
       rt = inorder(root.right)
       return lt + [root.val] + rt

This is the standard way to do traversal on a binary tree, the same technique could be used to do pre-/post-traversal as well. However, what could we do if recursion is not allowed? Now let's introduce one phenomenon called 'Tail Recursion'

Here is a good resource to learn more about tail recursion

Tail Recursion means that in the end of a recursion function, it jumps back to its start with arguments reduced/shrinks.

Example that is NOT a Tail Recursion:

def factorial(n):
   if n == 1:
      return 1
   else:
      return n*factorial(n-1)

Even though that the recursion happens at the end of the function, it is not PURELY the recursion itself, so it is NOT Tail Recursion.

This definition is not tail-recursive since the recursive call to factorial is not the last thing in the function (its result has to be multiplied by n)

Example that is a Tail Recursion:

def factorial(n, accumulator):
   if n == 1:
       return accumulator
   else:
       return factorial(n-1, n * accumulator)

This is definitely a Tail Recursion, since the last thing in the function is nothing but goes back to the function itself. Tail Recursion could be easily converted to a 'while' loop.

def factorial(n, accumulator):
    c = n
    while c != 1:
       accumulator *= c
    return accumulator

very intuitive, or?

The reason why sometimes iteration is better than recursion is because:

Recursive procedures call themselves to work towards a solution to a problem. In simple implementations this balloons the stack as the nesting gets deeper and deeper, reaches the solution, then returns through all of the stack frames. This waste is a common complaint about recursive programming in general.

Let's look back at recursion of inorder traversal:to be continued

Monday, June 23, 2014

SQL HACKS

SQL Zoo Practice

HACKS :

1. Running Total - Theta Join
you can join two tables with 'on a.col1 >= b.col1' -- this is called theta join

2. Cross Join
This is when you want to create combinations and need to generate Cartesian join in the first place

3. Simple CASE WHEN clause

4. Left Outer Joins for null values

5. Add col C that shows the bigger of col A and col B

6. Pivot!!!!!!!!!
To display rows as columns and columns as rows - using CASE WHEN(or left outer join) & Union
In the first query , there is a MAX in the clause, it is because group by needs either an aggregating function in the select statement or to use these col in group by clause!

7. Queries and web pages

a. AdventureWorks Schema

CustomerAW(CustomerID, FirstName, MiddleName, LastName, CompanyName, EmailAddress)

CustomerAddress(CustomerID, AddressID, AddressType)

Address(AddressID, AddressLine1, AddressLine2, City, StateProvince, CountyRegion, PostalCode)

SalesOrderHeader(SalesOrderID, RevisionNumber, OrderDate, CustomerID, BillToAddressID,
ShipToAddressID, ShipMethod, SubTotal, TaxAmt, Freight)

SalesOrderDetail(SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice,
UnitPriceDiscount)

ProductAW(ProductID, Name, Color, ListPrice, Size, Weight, ProductModelID, ProductCategoryID)

ProductModel(ProductModelID, Name)

ProductCategory(ProductCategoryID, ParentProductCategoryID, Name)

ProductModelProductDescription(ProductModelID, ProductDescriptionID, Culture)

ProductDescription(ProductDescriptionID, Description)

above schema detail could be found HERE

Medium Questions -- Details and playground can be found HERE
(better type queries into playground field to see the result)

6. A "Single Item Order" is a customer order where only one item is ordered. Show the SalesOrderID and the UnitPrice for every Single Item Order.

select s.SalesOrderID,p.ListPrice UnitPrice
from ProductAW p inner join SalesOrderDetail s
on p.ProductID = s.ProductID
group by s.SalesOrderID
having count(s.ProductID) = 1

7. Where did the racing socks go? List the product name and the CompanyName for all Customers who ordered ProductModel 'Racing Socks'.

select p.Name,c.CompanyName
from CustomerAW c inner join SalesOrderHeader s
on c.CustomerID = s.CustomerID
inner join SalesOrderDetail sd on sd.SalesOrderID = s.SalesOrderID
inner join ProductAW p on p.ProductID = sd.ProductID
inner join ProductModel pm on pm.ProductModelID = p.ProductModelID
where pm.Name = 'Racing Socks'

8.Show the product description for culture 'fr' for product with ProductID 736.

select pd.Description
from ProductDescription pd inner join ProductModelProductDescription pmpd
on pmpd.ProductDescriptionID = pd.ProductDescriptionID
inner join ProductAW p on p.ProductModelID = pmpd.ProductModelID
where pmpd.Culture = 'fr' and p.ProductID = 736

9. Use the SubTotal value in SaleOrderHeader to list orders from the largest to the smallest. For each order show the CompanyName and the SubTotal and the total weight of the order.

select c.CompanyName, s.SubTotal, s.Freight
from CustomerAW c inner join SalesOrderHeader s
on c.CustomerID = s.CustomerID
order by s.SubTotal DESC

10. How many products in ProductCategory 'Cranksets' have been sold to an address in 'London'?

select count(distinct(sd.ProductID)) as totalNumber
from SalesOrderHeader sh inner join Address a
on a.AddressID = sh.ShipToAddressID
inner join SalesOrderDetail sd on sd.SalesOrderID=sh.SalesOrderID
inner join ProductAW p on p.ProductID = sd.ProductID
inner join ProductCategory pc on pc.ProductCategoryID = p.ProductCategoryID
where a.City = 'London' and pc.Name = 'Cranksets'

Hard Questions -- Details and playground can be found HERE

11.For every customer with a 'Main Office' in Dallas show AddressLine1 of the 'Main Office' and AddressLine1 of the 'Shipping' address - if there is no shipping address leave it blank. Use one row per customer.

select c.CompanyName,ma.AddressLine1 'Main Office Address',

case when s.AddressType = 'Shipping' then sa.AddressLine1 else '' end as 'Shipping Address'
from CustomerAW c, CustomerAddress m, CustomerAddress s, Address ma, Address sa
where c.CustomerID = m.CustomerID and ma.AddressID = m.AddressID and c.CustomerID = s.CustomerID and s.AddressID = sa.AddressID and m.AddressType = 'Main Office' and ma.City = 'Dallas'

12. For each order show the SalesOrderID and SubTotal calculated three ways:
A) From the SalesOrderHeader
B) Sum of OrderQty*UnitPrice
C) Sum of OrderQty*ListPrice

select sh.SalesOrderID,sh.SubTotal, (sd.OrderQty*sd.UnitPrice) as b, (sd.OrderQty*p.ListPrice) as c
from SalesOrderHeader sh, SalesOrderDetail sd,
ProductAW p
where sh.SalesOrderID = sd.SalesOrderID and p.ProductID = sd.ProductID
group by sh.SalesOrderID

13. Show the best selling item by value.

select p.Name, sum(sd.OrderQty) as 'sales amout'
from SalesOrderDetail sd, ProductAW p
where sd.ProductID = p.ProductID
group by p.ProductID
order by sum(sd.OrderQty) DESC

14. Show how many orders are in the following ranges (in $) like the format below:

    RANGE      Num Orders      Total Value
    0-  99
  100- 999
 1000-9999
10000-

select t.range as RANGE, count(t.range) as 'Num Orders', sum(t.SubTotal) as 'Total Value' from (select case
when SubTotal between 0 and 99 then '0-99' when SubTotal between 100 and 999 then '100-999'
when SubTotal between 1000 and 9999 then '1000-9999'
else '10000-' end as range, SubTotal from SalesOrderHeader) t
group by t.range

THIS might be helpful to solve #14

15. Identify the three most important cities. Show the break down of top level product category against city.

Saturday, May 24, 2014

Sharding VS Partition

今天在网上学习NoSQL的时候发现老出现partition和sharding这两个关键字, 知道他们的大致意思是把数据分分开, 但是不知道具体是什么区别, 看了quora的一个回答, 感觉终于有点明了了(http://www.quora.com/Whats-the-difference-between-sharding-and-partition).

所谓pairition -- 分区, 是一个更general的动词, 就是把"数据分成几份" 的意思,其中有两种类型,第一种是 "horizontal partition" 水平分区 -- 也叫sharding, 另外一个叫做 "vertical partition" 垂直分区.

horizontal partition 意思很明显, 就是把一个很大的 , 在同一个table里面的数据 (在NoSQL里面应该说是同一个collection里面的documents) 按照row 来分成几个部分, 一般来说, 水平分区的时候会按照一个shard key来进行分区, 什么意思呢? 就比如我有一个table, 里面的PK是customerID, 最简单的利用shard key来分区的方法,就是把ID分成不同份的range来区分, 比如shard 1 是ID 从1-10000, shard 2 是ID从10001- 20000, 这样查找起来数据, 这样是最intuitive,当然DBA在分区的时候还会考虑更多的吗比如那些数据是经常会被一起查找的, 给予query的诉求,会为了减少I/O而把一部分常常一起作为结果的放在一个shard里面.

vertical partition 就是将一个表垂直分开,分成左右两部分.有什么应用呢? 比如,我有一个table, 里面是customerID以及若干个address, 一般来说我只需要一个biling address, 可是我不想lose track of other address, 为了方便, 我把主表和其他地址分成俩个地方储存, 用key 连接.

为什么我上面不直接把horizontal partition 叫sharding呢,其实两者还是有一定区别的, 有一个总结的不错的表,直接贴过来了~
地址:http://like-eagle.iteye.com/blog/704629

Different Types of NoSQL Database

NoSQL stands for 'not only SQL'.

并不是说完全不用到SQL的思想, 只是query的方式不同, 不需要用到传统的SQL query来进行数据的读取,还有一个很重要的特点是, NoSQL DB 不用到传统意义上的join 来进行table之间的结合,也当然就没有FK之类的概念, PK在NoSQL里面基本上是以hash key的形式存在 ( 在key-value以及document db中作为一个field -- _id存在)
现在主流的NoSQL数据库类型有四种:

a. Key-Value storage : Oracle NoSQL DB

b. Document Oriented Database: MongoDB, CouchDB

c. Wide-Column Store: Hadoop/HBase, Cassandra

d. Graph Database: Neo4j

在学习了lynda上的counchDB教程以后 (笔记详见板块里的记录), 今天阅读了一下MongoDB的官方指南 < The little MongoDB book>, 下面摘录了一下关于document store 的一些特点! 以MongoDB为例,(与counchDB有所不同)

Document oriented database: CouchDB, mongodb

I will be focusing on mongoDB as example. There are 6 main components in mongoDB

1. instance: one instance in mongoDB is made up of one or more databases.

2. database: The concept of 'database' is similar to the one used in
traditional relational database.

3. collections: one database is made up of one or more collections. And a
collection is made up of one or more documents. As a reference, you can think a collection as a table in traditional relational db.

4. document: one document is made up of several fields. one document could be
compared as one row in traditional relational database.

5. fileds: fileds could be compared as columns in table, but it doesn't
necessarily to be the same even within one table -- 'one document', which makes it more flexible than traditional db.

6. indexes: indexes in mongodb is similar with indexes in RDBMS

7. cursor: when you ask mongodb for data, it returns curosr, which could do a
lot of things without actually pulling out data, things include counting and skipping ahead. ( to be studied.)

不得不说对cursor和index的理解都不是很深入,需要再好好看一下,这里留个伏笔~